PRODUCTION OF 3-HYDROXYPROPIONIC ACID IN RECOMBINANT ORGANISMS



CROSS-REFERENCE TO RELATED APPLICATIONS 
This application claims priority from U.S. Provisional Patent Application S.N.
60/151,440 filed August 30, 1999.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
The research project which gave rise to the invention described in this patent
application was supported by EPA grant R824726-01. The United States Government
may have certain rights in this invention.



BACKGROUND OF THE INVENTION 
The technology of genetic engineering allows the transfer of genetic traits
between species and permits, in particular, the transfer of enzymes from one species to
others. These techniques first reached commercialization in connection with high-value
added products such as pharmaceuticals. The techniques of genetic engineering
are equally applicable and cost effective when applied to genes and enzymes which can
be used to make basic chemical feedstocks.

A metabolic pathway of interest exists in the bacterium Klebsiella pneumoniae,
which has the ability to biologically produce 3-hydroxypropionaldehyde from glycerol.
Native microorganisms have the ability to produce 1,3-propanediol from glycerol as
well. Commercial interests are exploring the production of 1,3-propanediol from
glycerol or glucose in recombinant organisms which have been engineered to express
the enzymes necessary for 1,3-propanediol production from other organisms.

3-Hydroxypropionic acid, CAS Registry Number [503-66-2] (abbreviated as 3-HP),
is a three-carbon, non-chiral organic molecule. The IUPAC nomenclature name for
this molecule is propionic acid, 3-hydroxy. It is also known as 3-hydroxypropionate,
β-hydroxypropionic acid, β-hydroxypropionate, 3-hydroxypropanoate, hydracrylic acid,
ethylene lactic acid, β-lactic acid and 2-deoxyglyceric acid. Applications of 3-HP
include the manufacture of absorbable prosthetic devices and surgical sutures,
incorporation into beta-lactams, production of acrylic acid, formation of
trifluoromethylated alcohols or diols, polyhydroxyalkanoates, and co-polymers with
lactic acid. 3-HP for commercial use is now commonly produced by organic chemical
syntheses. The 3-HP produced and sold by these methods is relatively expensive, and
it would be cost prohibitive to use it for the production of monomers for polymer
production. As discussed below, some organisms are known to produce 3-HP. However,
there is not yet available a catalog of genes from these organisms, and thus the
ability to synthesize 3-HP using the enzymes natively responsible for its synthesis
in the hosts which produce it does not now exist.

In addition to its commercial utility, 3-HP is found in a number of biological
processes, notably including many naturally occurring bio-polymers.
Poly(3-hydroxybutyrate) (PHB) is the most abundant member of the microbial polyesters
which contain hydroxy monomers, termed polyhydroxyalkanoates (PHAs). PHB has utility
as a biodegradable thermoplastic material and was first produced industrially in 1982.

The majority of published research on PHAs that contain 3-HP has concentrated
on two bacterial sources: Ralstonia eutropha ("Alcaligenes eutrophus") and
Pseudomonas oleovorans. Both Ralstonia eutropha and Pseudomonas oleovorans are
able to grow on a nitrogen-free medium containing 3-hydroxypropionic acid,
1,5-pentanediol or 1,7-heptanediol. When 3-HP is the major hydroxy acid added to the
growth medium, poly(3-hydroxybutyrate-co-3-hydroxypropionic acid) is formed
containing 7 mol % 3-hydroxypropionic acid. These cells also store a
poly(3-hydroxybutyrate-co-3-hydroxypropionic acid) containing 3 mol %
3-hydroxypropionic acid.

Recombinant systems have been used to create PHAs. An E. coli strain
engineered to express PHA synthase from either Ralstonia eutropha or Zoogloea
ramigera produced poly(3-hydroxypropionic acid) when fed 1,3-propanediol.
Skraly, F. A. "Polyhydroxyalkanoates Produced by Recombinant E. coli." Poster at
Engineering Foundation Conference: Metabolic Engineering II, 1998. An E. coli strain
that expressed PHA synthase (MBX820), when provided with the genes encoding
glycerol dehydratase and 1,3-propanediol dehydratase from K. pneumoniae, and
4-hydroxybutyryl-CoA transferase from Clostridium kluyveri, synthesized PHB from
glucose.

Glycerol dehydratase, found in the bacterial pathway for the conversion of
glycerol to 1,3-propanediol, catalyzes the conversion of glycerol to
3-hydroxypropionaldehyde and water. This enzyme has been found in a number of
bacteria, including strains of Citrobacter, Klebsiella, Lactobacillus, Enterobacter and
Clostridium. In the 1,3-propanediol pathway a second enzyme, 1,3-propanediol
oxidoreductase (EC 1.1.1.202), reduces 3-hydroxypropionaldehyde to 1,3-propanediol
in an NADH-dependent reaction. The pathway for the conversion of glycerol to
1,3-propanediol has been expressed in E. coli. Tong et al., Applied and
Environmental Microbiology 57 (12) 3541-3546. The genes responsible for the
production of 1,3-propanediol were cloned from the dha regulon of Klebsiella
pneumoniae. Glycerol is transported into the cell by the glycerol facilitator, and then
converted into 3-hydroxypropionaldehyde by a coenzyme B12-dependent
dehydratase. E. coli lacks a native dha regulon; consequently, E. coli cannot grow
anaerobically on glycerol without an exogenous electron acceptor such as nitrate or
fumarate.

Aldehyde dehydrogenases are enzymes that catalyze the oxidation of aldehydes
to carboxylic acids. The genes encoding non-specific aldehyde dehydrogenases have
been identified in a wide variety of organisms, e.g., ALDH2 from Homo sapiens, ALD4
from Saccharomyces cerevisiae, and both aldA and aldB from E. coli, to name a few.
These enzymes are classified by co-factor usage; most require either NAD+ or NADP+,
and some will use either co-factor. The genes singled out for mention here are able to
act on a number of different aldehydes, and it is likely that they may be able to oxidize
3-hydroxypropionaldehyde to 3-hydroxypropionic acid.



WO 01/16346

PCT/US00/23878



BRIEF SUMMARY OF THE INVENTION 
The present invention is intended to permit the creation of a recombinant
microbial host which is capable of synthesizing 3-HP from a starting material of
glycerol or glucose. The glycerol or glucose is converted to 3-hydroxypropionaldehyde
(abbreviated as 3-HPA), which is then converted to 3-HP.
This process requires the so-called dhaB gene from Klebsiella pneumoniae, which
encodes the enzyme glycerol dehydratase, and any one of four different aldehyde
dehydrogenase genes to convert 3-HPA to 3-HP. The four aldehyde dehydrogenase
genes used were aldA from the bacterium E. coli, ALDH2 from humans, ALD4 from the
yeast Saccharomyces cerevisiae, and aldB from E. coli. The yeast gene appeared to give
the best results.

It is an object of the present invention to provide a genetic construct which
encodes glycerol dehydratase and aldehyde dehydrogenase enzymes necessary for the
production of 3-hydroxypropionic acid from glycerol.

It is also an object of the present invention to provide a method for the
production of 3-hydroxypropionic acid from glycerol.

Other features and advantages of the invention will be apparent from the
following description of the preferred embodiment thereof and from the claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
Not applicable. 

DETAILED DESCRIPTION OF THE INVENTION 
It is disclosed here that it is possible to introduce into a bacterial host genes
encoding two enzymes and thus confer upon that host the ability to produce 3-HP from
glycerol. The two necessary enzymes are glycerol dehydratase and aldehyde
dehydrogenase. It is here reported that the two enzymes are both necessary and
sufficient to enable a strain of a suitable host, such as a competent E. coli strain, to make
3-HP from glycerol. An exemplary gene encoding a glycerol dehydratase, the
dhaB gene from Klebsiella pneumoniae, is known and has been sequenced and rendered
convenient to use. Several exemplary aldehyde dehydrogenases are known, and their
sequences are presented here. From this information, it becomes practical to confer
upon a bacterial host the ability to convert glycerol into 3-HP in a commercially
reasonable manner.

It was not apparent before the completion of the work described here that these
two diverse enzymes could be produced in a common host to confer the ability to
make 3-HP. There are many known aldehyde dehydrogenase enzymes and genes, and
the enzymes are known to have varying substrate specificities and efficiencies. There
was no evidence, prior to the work described here, that the aldehyde dehydrogenase
enzyme would work on the 3-hydroxypropionaldehyde (3-HPA) substrate to create
3-HP. Without that knowledge, there were no data from which to predict the
effectiveness of the 3-HP production studies described below. An additional
uncertainty arises from the fact that the intermediate aldehyde, 3-HPA, is toxic to
many bacterial hosts, and thus the survival of the host is dependent upon the relative
rates of enzymatic production of the aldehyde intermediate and its conversion to
non-toxic 3-HP.

A difficulty in the realization of the production of 3-HP desired here is that
ribosome binding sites from non-native hosts are often ineffectual and lead to poor
protein production, and that many non-native promoters are poorly transcribed and
are a bar to high protein expression. However, the inventors also recognized that a
non-native promoter that is known to be very active and is inducible by the addition of a
small molecule unrelated to the pathway being expressed is often a very efficient way to
express and regulate the levels of enzymes expressed in hosts such as E. coli. To
achieve high levels of regulated gene expression, plasmids were constructed which
placed the expression of all exogenous genes necessary for the production of
3-hydroxypropionic acid from glycerol under the regulation of the trc promoter. The trc
promoter is efficient, not native to E. coli, and inducible by the addition of IPTG.

The present specification describes a genetic construct for use in the production
of 3-hydroxypropionic acid from glycerol. The genetic construct includes exemplary
DNA sequences coding for the expression of a glycerol dehydratase and a DNA
sequence coding for aldehyde dehydrogenase. The set of exemplary sequences
necessary for the expression of glycerol dehydratase is collectively referred to as
"dhaB". The set of sequences necessary for the expression of aldehyde dehydrogenase
includes any one of four different genes which proved efficacious. The individual
aldehyde dehydrogenase sequences are referred to individually as ALDH2, ALD4, aldA
and aldB.

Producing 3-hydroxypropionic acid in a foreign host

In the work described below, the enzymes necessary for the production of
3-hydroxypropionic acid from glycerol in E. coli were expressed under the regulation of
the trc promoter, a non-native promoter inducible by the addition of IPTG. The
glycerol dehydratase was encoded by the dhaB gene from Klebsiella pneumoniae; the
aldehyde dehydrogenase used was encoded by any one of four different genes (ALDH2
from Homo sapiens, ALD4 from S. cerevisiae, aldB from E. coli or aldA from E. coli).
Expression of the genes coding for glycerol dehydratase and any one of the genes
encoding an aldehyde dehydrogenase was sufficient to enable the construct to produce
3-HP when the fermentation medium was supplemented with glycerol. In all of these
constructs, the dhaB gene was downstream from the gene encoding the aldehyde
dehydrogenase used, and expression of both genes was regulated by the trc promoter.
This order, however, is not required; other orderings of the genes on a construct, and
the use of multiple constructs, are possible.

In a minimal genetic construct made based on the data presented here, the only
genetic elements that would be necessary are the structural genes dhaB and an
aldehyde dehydrogenase gene encoding a protein that efficiently catalyzes the oxidation
of 3-hydroxypropionaldehyde to 3-hydroxypropionic acid, and non-native promoter
sequences specifically selected to give the type of inducible control most appropriate for
the context of the process in which the construct is to be used. Extraneous pieces of
DNA, whether retained in the construct or added from other DNA sequences, would not
necessarily be detrimental to effective 3-HP synthesis by the host organism, but would
not be needed. Each sequence to be translated would necessarily be preceded by a
ribosome binding site functional in the selected host, so that the messenger RNA(s)
coding for the proteins of interest could be translated by ribosomes. Terminator
sequences immediately downstream of each translated unit would also be necessary in
some organisms, particularly in eukaryotes. The construct could be part of an
autonomously replicating sequence, such as a plasmid or phage vector, or could be
integrated into the genome of the host.

The structural genes and appropriate promoter(s) could be isolated by the use of
restriction enzymes, by the polymerase chain reaction (PCR), by chemical synthesis of
the appropriate oligonucleotides, or by other methods apparent to those skilled in the art
of molecular biology. The promoter(s) would be derived from genomic DNA of other
organisms or from artificial genetic constructs containing promoters. Appropriate
promoter fragments would be ligated into the construct upstream of the structural genes
in any one of several possible arrangements.

The aldehyde dehydrogenase expressed would: have high specific activity
towards 3-hydroxypropionaldehyde; be very stable in the host in which it is expressed; be
readily over-expressed in the selected host; not be inhibited by either the substrates
necessary for the reaction or the products formed by the reaction; be fully active under
the fermentation conditions most favorable for the production of 3-hydroxypropionic
acid; and be able to use either NAD+ or NADP+.

One possible arrangement is the true operon, where one promoter is used to
direct transcription in one direction of all necessary Open Reading Frames (ORFs). The
entire message is then contained in one messenger RNA. The advantages of the operon
are that it is relatively easy to construct, since only one promoter is needed; that it is
relatively simple to replace the promoter with another promoter if that would be
desirable later; and that it assures that the two genes are under the same regulation. The
main disadvantage of the operon scheme is that the levels of expression of the two
genes cannot be varied independently. If it is found that the genes, for optimal
3-hydroxypropionic acid synthesis, should be expressed at different levels, the operon in
most cases cannot be used to realize this.

Another possible arrangement is the multiple-promoter scheme. Two or more
promoters, with the same or distinct regulatory behavior, could be used to direct
transcription of the genes. For example, one promoter could be used to direct
transcription of dhaB and one to direct transcription of the gene encoding the
appropriate aldehyde dehydrogenase. Because the genes theoretically can be
transcribed and translated separately, a great number of combinations of multiple
promoters is possible. Additionally, it would be most desirable to prevent the promoters
from interfering with one another. This could be achieved either by placing two
promoters into the construct such that they direct transcription in opposite directions, or
by inserting transcriptional terminator sequences downstream of each separately
transcribed unit. The main advantage of the multiple-promoter construct is that it
permits independent regulation of as many distinct units as desired, which could be
important. The disadvantages are that it would be more difficult to construct; more
difficult to amend later; and more difficult to effectively regulate, since multiple
changes in fermentation conditions would need to be introduced and might render the
performance of the fermentation somewhat less predictable.
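
The trade-off between the two arrangements can be made concrete with a small data-structure sketch. The Python below is illustrative only: the class, field and function names are hypothetical, and "promoter_2" is a placeholder for an unspecified second promoter, not one used in the work described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranscriptionUnit:
    promoter: str             # promoter driving this unit
    genes: List[str]          # open reading frames, in transcription order
    terminator: bool = False  # True if a terminator follows the unit


def independently_regulated(units: List[TranscriptionUnit]) -> bool:
    """Return True if the construct uses more than one distinct promoter."""
    return len({u.promoter for u in units}) > 1


# True operon: one promoter, one message carrying both genes
# (aldehyde dehydrogenase gene upstream of dhaB, as in the constructs described).
operon = [TranscriptionUnit("trc", ["aldehyde_dehydrogenase", "dhaB"])]

# Multiple-promoter scheme; "promoter_2" is a hypothetical placeholder.
multi = [
    TranscriptionUnit("trc", ["dhaB"], terminator=True),
    TranscriptionUnit("promoter_2", ["aldehyde_dehydrogenase"], terminator=True),
]

print(independently_regulated(operon))
print(independently_regulated(multi))
```

In this sketch the operon reports no independent regulation, which mirrors the main disadvantage stated above, while the two-unit layout does, at the cost of an extra promoter and terminators to manage.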

In any construct, the promoter sequence(s) used should be functional in the
selected host organism and preferably provide sufficient transcription of the genes
comprising the glycerol to 3-hydroxypropionic acid pathway to enable the construct to
be adequately active in that host. The promoter sequence(s) used would also effect
regulation of transcription of the genes, enabling the glycerol to 3-HP pathway to be
adequately active under the fermentation conditions employed for 3-HP production, and
preferably they would be inducible, such that expression of the genes could be
modulated by the inclusion in, or exclusion from, the fermentation of certain agents or
conditions.

A plausible example of the use of such a construct follows: one promoter, which
is induced by the addition of an inexpensive chemical (the inducer) to the medium, could
control transcription of both the dhaB gene and the gene encoding the appropriate
aldehyde dehydrogenase. The cells would be permitted to grow in the absence of the
inducer until they accumulated to a predetermined level. The inducer would then be
added to the fermentation, and nutritional changes commensurate with the altered
metabolism would be made to the medium as well. The cells would then be permitted to
utilize the substrate(s) provided for 3-HP production (and additional biomass production
if desired). After the cells could no longer use substrate to produce 3-HP, the
fermentation would be stopped and the 3-HP recovered.

Genetic Sequences 

To express glycerol dehydratase and a suitable aldehyde dehydrogenase, the two
enzymes necessary for the production of 3-hydroxypropionic acid from glycerol, it is
required that the DNA sequences containing the glycerol dehydratase and aldehyde
dehydrogenase coding sequences be combined with at least a promoter sequence
(preferably a non-native promoter, although some native promoter activity may be
present). An exemplary method of construction is described in the example below. To
ensure that the present specification is enabling, the full sequences of the coding regions
of genes for these enzymes are presented here.

Sequences 1, 3, 5 and 7 present different native genomic sequences for genes
encoding aldehyde dehydrogenases.

SEQ ID NO:1 contains the full native DNA sequence encoding the ALD4
enzyme from Saccharomyces cerevisiae. The amino acid sequence of the protein is
presented as SEQ ID NO:2.

SEQ ID NO:3 includes the DNA sequence for the human ALDH2 gene, again
including the full protein coding region. The amino acid sequence for this human
aldehyde dehydrogenase is presented in SEQ ID NO:4.

SEQ ID NO:5 and 7 respectively present the full coding sequences from the E.
coli genes aldA and aldB, both of which encode aldehyde dehydrogenases. The amino
acid sequences for the proteins encoded by the genes are presented in SEQ ID NO:6
and 8 respectively.

SEQ ID NO:9 contains the native genomic DNA sequence for the dhaB gene
from the dha regulon of Klebsiella pneumoniae. The coding sequences for this
complex regulon produce five polypeptides, which are presented as SEQ ID NOS:10
through 13, which together provide the activity of the glycerol dehydratase enzyme.
Each of these coding sequences can be used to make genetic constructs for the
expression of the appropriate enzymes in heterologous hosts. In making genetic
constructs for expression of the genes in such hosts, it is contemplated that heterologous
promoters will be joined to the coding sequences for the enzymes, but all that is required
is that the promoters be effective for the hosts in which the genes are to be expressed. It
is also contemplated and envisioned that significant variations in DNA sequence are
possible from the native DNA coding sequences presented here. As is well known in
the art, due to the degeneracy of the genetic code, many different DNA sequences can
encode the expression of the same protein. So, when this document uses language
specifying a DNA sequence encoding a protein, it is intended to encompass any DNA
sequence which can be used to express that protein even if different from the genomic
sequences presented here. It is also contemplated that conservative changes in the
amino acid sequences of the proteins specified here can be made without departing from
the present invention. In particular, deletions, additions and substitutions of one or more
amino acids in a protein sequence can almost always be made without changing protein
functionality. When the name of a protein is used here, it is intended to be equally
applicable to both such minor changes in amino acid sequence and to allelic variations
in native protein sequence as occur within the species named as well as other closely
related species.
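
The degeneracy point can be illustrated with a short, self-contained Python sketch. It builds the standard genetic code table and translates two different DNA sequences to the same peptide. The two example sequences are invented for illustration and are not taken from any SEQ ID in this document; the function name is likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two invented sequences that differ at several positions, including
# synonymous codons, yet encode the same four-residue peptide.
seq_a = "ATGGCTAAAGGTTAA"
seq_b = "ATGGCCAAGGGATAA"

print(translate(seq_a), translate(seq_b), seq_a == seq_b)
```

Both sequences translate to the same peptide even though the DNA differs, which is the sense in which language here specifying "a DNA sequence encoding a protein" covers more than the genomic sequence.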

It is possible that many of the above DNA sequences could be truncated and still
express a protein that has the same enzymatic properties. One skilled in the art of
molecular biology would appreciate that minor deletions, additions and mutations may
not change the attributes of the designated base pair sequences; many of the nucleotides
of the designated base pair sequences are probably not essential for their unique
function. To determine whether or not an altered sequence or sequences has sufficient
homology with the designated base pairs to function identically, one would simply
create the candidate mutation, deletion or alteration and create a gene construct
including the altered sequence together with promoter and termination sequences. This
gene construct could be tested, as described below, for the production of 3-HP from
glycerol.

Certain DNA primers were used to isolate or clone the genomic DNA sequences
used in the experiments described below. While the sequence information presented
here is sufficient to enable the construction of expression plasmids incorporating the
genes identified here, in order to redundantly enable the use of these genes, primers
which may be used to isolate the genes from their native hosts are described below.

The primers aldA_L (SEQ ID NO:14) and aldA_R (SEQ ID NO:15) were used
to amplify the 1513 bp aldA fragment from genomic E. coli DNA (strain MG1655, a
gift from the Genetic Stock Center, New Haven, CT). The gel-purified PCR fragment
containing a DNA sequence coding for the expression of aldehyde dehydrogenase was
inserted into the NcoI-XhoI site of pSE380 (Invitrogen, San Diego, CA) to give pPFS3.
The resulting plasmid contained aldA under the control of the trc promoter. This
construct allowed for high-level expression of the aldA gene from E. coli under
regulation of the trc promoter. Unless indicated otherwise, all molecular biology and
plasmid constructions were done in E. coli AG1 (Stratagene, La Jolla, CA).

The primers aldB_L (SEQ ID NO:20) and aldB_R (SEQ ID NO:21) were used
to amplify the 1574 bp aldB fragment from genomic E. coli DNA (strain MG1655).
The resulting PCR converted the TGA stop codon into a TAA stop codon. The
gel-purified PCR fragment containing the DNA sequence coding for the
expression of aldehyde dehydrogenase was inserted into the KpnI-SacI site of pSE380 to
give pPFS12.

The primers ALD4_L (SEQ ID NO:16) and ALD4_R (SEQ ID NO:17) were
used to amplify the 1595 bp ALD4 fragment from S. cerevisiae DNA (strain YPH500).
The gel-purified fragment containing a DNA sequence coding for the expression of
aldehyde dehydrogenase was inserted into the KpnI-SacI site of pPFS3 to give pPFS8.
The resulting plasmid contained mature ALD4 under control of the trc promoter.

The primers ALDH2_L (SEQ ID NO:18) and ALDH2_R (SEQ ID NO:19)
were used to amplify the 1541 bp ALDH2 fragment from pT7-7::ALDH2, a gift from
H. Weiner (Purdue University, West Lafayette, IN). The gel-purified PCR fragment
containing a DNA sequence sufficiently homologous to base pairs 22 to 1524, inclusive,
of SEQ ID NO:3 so as to code for the expression of aldehyde dehydrogenase was
inserted into the KpnI-SacI site of pSE380 to give pPFS7. This sequence was moved
from pPFS7 into the KpnI-SacI site of pPFS3 to give pPFS9. The resulting plasmid
contained mature ALDH2 under the control of the trc promoter.

The primers pTRC_L (SEQ ID NO:22) and pTRC_R (SEQ ID NO:23) were
used to amplify the 540 bp fragment from pSE380. The gel-purified PCR fragment was
inserted into the HpaI-KpnI site of pPFS3 to give pPFS13. The resulting plasmid
lacked the "native" ribosome binding site of pSE380 and an NcoI site (which contained
an extraneous ATG start codon upstream of the cloned genes). The KpnI-SacI
fragments of pPFS8, pPFS9, and pPFS12 were inserted into the KpnI-SacI site of
pPFS13 to give pPFS14, pPFS15, and pPFS16, respectively.
Assay for production of 3-HP

The efficacy of changes made as contemplated herein can be checked by the
following tests. To test for the production of 3-HP, fermentation products can be
quantified with a Waters Alliance Integrity HPLC system (Milford, MA) equipped with
a refractive index detector, a photodiode array detector, and an Aminex HPX-87H
(Bio-Rad, Hercules, CA) organic acids column. The mobile phase should be 0.01 N
sulfuric acid solution (pH 2.0) at a flow rate of 0.5 mL/min. The column temperature
should be set to 40 °C. Compounds can be identified by determining if they co-elute
with authentic standards. Prior to analysis, all samples should be filtered through a
0.45 µm pore size membrane (Gelman Sciences, Ann Arbor, MI). The fractions of the
fermentation products collected using HPLC should be analyzed on a Varian Star 3400
CX gas chromatograph coupled to a Varian Saturn 3 mass spectrometer (GC-MS)
(Walnut Creek, CA).

Assay for enzyme activity.

Aldehyde dehydrogenase activity can be determined by measuring the reduction
of β-NAD+ at 25 °C with 3-hydroxypropionaldehyde as a substrate. All buffers should
contain 1 mM ethylenediaminetetraacetic acid (EDTA), 0.1 mM Pefabloc SC
(Boehringer Mannheim, Indianapolis, IN) and 1 mM Tris(carboxyethyl)phosphine
hydrochloride (TCEP-HCl). For ALD4, the solution should contain 100 mM Tris-HCl
buffer (pH 8.0) and 100 mM KCl. For ALDH2, the solution should contain 100 mM
sodium pyrophosphate (pH 9.0). For AldA and AldB, the solution should contain 20
mM sodium glycine (pH 9.5). A total of 3.0 mL of buffer should be added to quartz
cuvettes and allowed to equilibrate to assay temperature. From 5 to 20 µL of cell extract
should be added and background activity recorded after the addition of β-NAD+ to a
final concentration of 0.67 mM. The reaction should be started by the addition of
substrate (either acetaldehyde, propionaldehyde, or 3-hydroxypropionaldehyde) to a
final concentration of 2 mM. Assay mixtures should be stirred with micro-stirrers
during the assays.

For aldehyde dehydrogenase activity assays, one unit is defined as the reduction
of 1.0 nM of β-NAD+ per minute at 25 °C. These reactions can be monitored by
following the change in absorbance at 340 nm (A340) at 25 °C on a Varian Cary-1 Bio
spectrophotometer (Sugar Land, TX). Total protein concentrations in the cell extracts
can be determined using the Bradford assay method (Bio-Rad, Hercules, CA) with
bovine serum albumin as the standard.
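
As a rough sketch of how an A340 trace might be turned into a rate, the Python below applies the Beer-Lambert law using the commonly cited molar absorptivity of NADH at 340 nm (6220 M^-1 cm^-1). The 1 cm path length, the 3.0 mL assay volume, the function names and all example readings are assumptions for illustration; the result is expressed in µmol of NADH formed per minute, not in the unit defined above.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, commonly cited value for NADH at 340 nm


def nadh_rate_umol_per_min(delta_a340_per_min, volume_ml, path_cm=1.0):
    """Convert an absorbance slope (A340 per minute) to µmol NADH formed per minute."""
    molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_cm)  # mol/L/min
    return molar_per_min * (volume_ml / 1000.0) * 1e6                  # µmol/min


def specific_activity(slope, background_slope, volume_ml, protein_mg):
    """Background-corrected rate per mg of total protein (µmol/min/mg)."""
    net_slope = slope - background_slope
    return nadh_rate_umol_per_min(net_slope, volume_ml) / protein_mg


# Hypothetical readings: slope after substrate addition, background slope
# recorded before substrate addition, and protein from a Bradford assay.
rate = nadh_rate_umol_per_min(0.0622, volume_ml=3.0)
activity = specific_activity(0.0722, 0.0100, volume_ml=3.0, protein_mg=0.05)
print(f"{rate:.3f} umol/min, {activity:.2f} umol/min/mg")
```

Subtracting the background slope corresponds to the step above in which background activity is recorded after β-NAD+ addition but before the substrate is added.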

EXAMPLES 

Plasmid constructions. 

Klebsiella pneumoniae expresses glycerol dehydratase (the dhaB gene product), an
enzyme that catalyzes the conversion of glycerol to 3-hydroxypropionaldehyde, and
1,3-propanediol oxidoreductase (the dhaT gene product), an enzyme that catalyzes the
conversion of 3-hydroxypropionaldehyde to 1,3-propanediol. A plasmid encoding these
two genes was created and expressed in E. coli (plasmid pTC53). The dhaT gene was
deleted from pTC53 to create pMH34. The resulting plasmid still contained the DNA
sequence complementary to base pairs 330 to 2153, inclusive, of SEQ ID NO:9, the
complement of base pairs 2166 to 2591, inclusive, of SEQ ID NO:9, and the complement
of base pairs 3191 to 4858, inclusive, of SEQ ID NO:9, so as to code for the expression
of glycerol dehydratase. The fragment of DNA encoding these sequences was excised
from pMH34 by cutting it with SalI-XbaI, and the resulting fragment was gel purified
(the purified fragment was a gift from M. Hoffman of the University of
Wisconsin-Madison). This DNA fragment was inserted into the SalI-XbaI site of pPFS13
to give pPFS17.

The resulting plasmid contained both the aldA and dhaB genes under the control
of the trc promoter. Similarly, the gel-purified SalI-XbaI fragment from pMH34 was
inserted into the SalI-XbaI sites of pPFS14, pPFS15, and pPFS16 to give pPFS18,
pPFS19, and pPFS20, respectively. These plasmids contained ALD4, ALDH2, and
aldB, respectively, as well as dhaB under the control of the trc promoter; in all of the
constructs the dhaB gene was downstream of the gene encoding the aldehyde
dehydrogenase.
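
Because the plasmids are built in several stages, it can help to trace each one back to its starting vector. The Python sketch below records, for each plasmid named above, its parent and the element inserted, as read from the text with plasmid names normalized; the dictionary layout and function name are hypothetical conveniences, not part of the described work.

```python
# Parent vector and inserted element for each plasmid, as read from the text.
PLASMIDS = {
    "pPFS3":  ("pSE380", "aldA (NcoI-XhoI)"),
    "pPFS7":  ("pSE380", "ALDH2 (KpnI-SacI)"),
    "pPFS12": ("pSE380", "aldB (KpnI-SacI)"),
    "pPFS8":  ("pPFS3", "ALD4 (KpnI-SacI)"),
    "pPFS9":  ("pPFS3", "ALDH2 moved from pPFS7 (KpnI-SacI)"),
    "pPFS13": ("pPFS3", "540 bp fragment amplified from pSE380 (HpaI-KpnI)"),
    "pPFS14": ("pPFS13", "KpnI-SacI fragment of pPFS8"),
    "pPFS15": ("pPFS13", "KpnI-SacI fragment of pPFS9"),
    "pPFS16": ("pPFS13", "KpnI-SacI fragment of pPFS12"),
    "pPFS17": ("pPFS13", "dhaB, SalI-XbaI fragment of pMH34"),
    "pPFS18": ("pPFS14", "dhaB, SalI-XbaI fragment of pMH34"),
    "pPFS19": ("pPFS15", "dhaB, SalI-XbaI fragment of pMH34"),
    "pPFS20": ("pPFS16", "dhaB, SalI-XbaI fragment of pMH34"),
}


def lineage(name):
    """Follow parent links from a plasmid back to its starting vector."""
    chain = [name]
    while name in PLASMIDS:
        name = PLASMIDS[name][0]
        chain.append(name)
    return chain


print(" <- ".join(lineage("pPFS18")))
```

Tracing any of the final dhaB-containing plasmids this way shows that each descends from pPFS13, and through it from pPFS3 and pSE380.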
Expression in E. coli.

The efficacy of E. coli as a platform for the production of 3-HP from growth on
glucose has been examined using a mathematical model developed for this purpose. The
model was executed in two different ways, assuming the conversion of one mole of
glucose under either anaerobic or aerobic conditions either directly to 3-HP or to the
production of 3-HP and ATP. The optimum yield under anaerobic conditions is 1 mole
of 3-HP and 1 mole of lactate. The more realistic yield under anaerobic conditions is
0.5 moles of 3-HP, 1.5 moles of lactate and 1 mole of ATP. The optimum yield under
aerobic conditions is 1.9 moles of 3-HP and 0.3 moles of CO2. The realistic yield under
aerobic conditions is 1.85 moles of 3-HP, 0.35 moles of CO2 and 1 mole of ATP.
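
A simple carbon balance gives a quick sanity check on these per-mole-of-glucose figures. The Python sketch below counts carbon atoms in the listed products for each scenario, taking the yields as given above; the dictionary and function names are hypothetical, and ATP is omitted because it carries no substrate carbon in this accounting.

```python
# Carbon atoms per molecule.
CARBONS = {"glucose": 6, "3-HP": 3, "lactate": 3, "CO2": 1}

# Moles of each product per mole of glucose, as stated in the text.
SCENARIOS = {
    "anaerobic, optimum":   {"3-HP": 1.0, "lactate": 1.0},
    "anaerobic, realistic": {"3-HP": 0.5, "lactate": 1.5},
    "aerobic, optimum":     {"3-HP": 1.9, "CO2": 0.3},
    "aerobic, realistic":   {"3-HP": 1.85, "CO2": 0.35},
}


def carbon_out(products):
    """Moles of carbon recovered in the listed products per mole of glucose."""
    return sum(CARBONS[name] * moles for name, moles in products.items())


for label, products in SCENARIOS.items():
    print(f"{label}: {carbon_out(products):.2f} of {CARBONS['glucose']} mol C")
```

With the figures as given, the first three scenarios account for all six moles of glucose carbon, while the realistic aerobic case accounts for about 5.9; the text does not say where the remaining carbon goes.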
The effect of 3-HP concentration on the growth of E. coli strain MG1655 was
measured. Cells were grown on standard media with and without the addition of up to
80 g/L of 3-HP. The best fit of these data demonstrated that 3-HP was only 1.4 times as
inhibitory as lactic acid on the growth of E. coli. It is possible to economically produce
lactic acid using E. coli; since 3-HP is only 1.4 times more inhibitory than lactic acid,
it should be possible to use E. coli as a host for the commercial production of 3-HP.

Media and growth conditions 

The standard media contained the following per liter: 6 g Na2HPO4, 3 g KH2PO4,
1 g NH4Cl, 0.5 g NaCl, 3 mg CaCl2, 5 g yeast extract (Difco Laboratories, Detroit, MI)
and 2 mM MgSO4. When necessary to retain plasmids, ampicillin (100 mg/mL) was
added to the media. Isopropyl-β-thiogalactopyranoside (IPTG) was added in varying
amounts to induce gene expression. All fermentations were carried out in an
incubator-shaker at 37°C and 200 rpm. Anaerobic fermentations were carried out in 500-mL
anaerobic flasks with 300 mL of working volume. Inocula for fermentations were
grown overnight in Luria-Bertani medium supplemented with ampicillin as necessary.
The 300-mL fermentations were inoculated with 1.5 mL of the overnight culture. For
enzyme assays, fermentations were incubated for 24 hours.
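
As a convenience, the per-liter recipe can be scaled to the 300-mL working volume, and the inoculum expressed as a volume fraction. This is a minimal sketch; the names STANDARD_MEDIUM_PER_L, scale_recipe and inoculum_fraction are hypothetical and are not part of the described procedure.

```python
# Per-liter composition of the standard media, taken from the text.
# Units are given in each key because the components are not all in grams.
STANDARD_MEDIUM_PER_L = {
    "Na2HPO4 (g)": 6.0,
    "KH2PO4 (g)": 3.0,
    "NH4Cl (g)": 1.0,
    "NaCl (g)": 0.5,
    "CaCl2 (mg)": 3.0,
    "yeast extract (g)": 5.0,
    "MgSO4 (mmol)": 2.0,  # 2 mM corresponds to 2 mmol per liter
}


def scale_recipe(volume_ml, recipe=STANDARD_MEDIUM_PER_L):
    """Return the amount of each component needed for volume_ml of medium."""
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in recipe.items()}


def inoculum_fraction(inoculum_ml, culture_ml):
    """Inoculum size as a volume fraction of the culture."""
    return inoculum_ml / culture_ml


for name, amount in scale_recipe(300).items():
    print(f"{name}: {amount:.2f}")
print(f"inoculum: {inoculum_fraction(1.5, 300):.1%} v/v")
```

For the 300-mL flasks described above, the 1.5-mL inoculum corresponds to 0.5% of the culture volume.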

Overexpression of aldehyde dehydrogenase in E. coli.

Cells were harvested by centrifugation at 3000 x g for 10 minutes at 4°C with a
Beckman (Fullerton, CA) model J2-21 centrifuge. Cell pellets were washed twice in
100 mM potassium phosphate buffer at pH 7.2 and resuspended in the appropriate assay
resuspension buffer equal to 5 x the volume of the wet cell mass. The cells were
homogenized using a French pressure cell. The homogenate was centrifuged at 40000 x
g for 30 minutes. The supernatant was dialyzed against the appropriate resuspension
buffer using 10000 molecular weight cut-off pleated dialysis tubing (Pierce, Rockford,
IL) at 4°C. Dialysis buffer was changed after 2 hours and again after 4 hours, and dialysis
was stopped after being allowed to proceed overnight.

E. coli AG1 cells transfected with the plasmids constructed to express the aldA,
ALD4, ALDH2, or aldB genes were grown in 500-mL anaerobic flasks. Twelve hours
after the fermentations were inoculated, IPTG was added to induce enzyme expression.
The cells were allowed to grow for an additional 12 hours, then harvested and lysed as
discussed above. The soluble fraction of the lysate was assayed for aldehyde
dehydrogenase activity using the substrate 3-hydroxypropionaldehyde in the buffer
appropriate for the particular enzyme expressed. The plasmid, aldehyde dehydrogenase
expressed and specific activity measured (U/mg of protein) were as follows: pPFS13,
aldA, 0.2; pPFS14, ALD4, 0.5; pPFS15, ALDH2, 0.3; and pPFS16, aldB, 0.1. The
control, E. coli strain AG1 harboring plasmid pSE380, encoded no exogenous aldehyde
dehydrogenase activity and had no detectable activity with 3-HP as substrate. It is
clear from the activity assays that all four aldehyde dehydrogenases were expressed in
E. coli. The aldehyde dehydrogenase cloned from Saccharomyces cerevisiae (ALD4)
had the highest activity when 3-hydroxypropionaldehyde was used as the substrate (0.5
units/mg of protein).
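
The four specific activities reported above can be tabulated and ranked as follows. This is only an illustrative sketch of the comparison; the names ACTIVITY and rank_by_activity are assumptions and do not come from the described work.

```python
# Specific activity toward 3-hydroxypropionaldehyde (U/mg of protein),
# keyed by plasmid, with the aldehyde dehydrogenase gene each one carries.
ACTIVITY = {
    "pPFS13": ("aldA", 0.2),
    "pPFS14": ("ALD4", 0.5),
    "pPFS15": ("ALDH2", 0.3),
    "pPFS16": ("aldB", 0.1),
}


def rank_by_activity(table):
    """Return (plasmid, gene, activity) tuples, highest activity first."""
    rows = [(plasmid, gene, act) for plasmid, (gene, act) in table.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranked = rank_by_activity(ACTIVITY)
best_activity = ranked[0][2]
for plasmid, gene, act in ranked:
    # Express each activity relative to the most active enzyme.
    print(f"{plasmid:7s} {gene:6s} {act:.1f} U/mg ({act / best_activity:.0%} of best)")
```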

E. coli cells transformed with plasmids expressing aldehyde dehydrogenase,
both aldehyde dehydrogenase and glycerol dehydratase, or neither gene were grown
and assayed for their ability to produce 3-HP from glycerol. The cells were grown on
standard media supplemented with 6 µM of coenzyme B12, under anaerobic conditions
in the absence of light (to protect the integrity of the coenzyme B12 necessary for DhaB
activity). After 12 hours, IPTG was added to induce expression of the genes under the
trc promoter; at the same time, 5 g/L of glycerol was added. After 12 more hours of
anaerobic fermentation the fermentation broth was assayed for 3-HP by HPLC and GC.
The plasmid, aldehyde dehydrogenase gene expressed and g/L of 3-HP measured were
as follows: pPFS17, aldA, 0.031; pPFS18, ALD4, 0.173; and pPFS19, ALDH2, 0.061.
Cells expressing dhaB but no exogenous aldehyde dehydrogenase genes (plasmid
pMH34) produced 0.015 g/L of 3-HP. Cells expressing aldA, ALD4, ALDH2 or aldB
but not dhaB (plasmids pPFS13, pPFS14, pPFS15, pPFS16, respectively) all produced
less than 0.005 g/L of 3-HP when the media the cells were growing in was
supplemented with 2.5 g/L of 3-hydroxypropionaldehyde.
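
To put the measured titers in perspective, each can be compared with the dhaB-only control and with the 5 g/L of glycerol supplied. The sketch below keys the data by gene; the function names and the molecular weights (glycerol 92.09 g/mol, 3-HP 90.08 g/mol) are assumptions introduced here, and the molar figure assumes all of the added glycerol was available for conversion.

```python
# 3-HP titers (g/L) after 12 hours of induced anaerobic fermentation,
# keyed by the aldehyde dehydrogenase gene co-expressed with dhaB.
TITER_G_PER_L = {"aldA": 0.031, "ALD4": 0.173, "ALDH2": 0.061}
DHAB_ONLY_G_PER_L = 0.015  # plasmid pMH34: dhaB, no exogenous aldehyde dehydrogenase

GLYCEROL_G_PER_L = 5.0     # glycerol added at induction
MW_GLYCEROL = 92.09        # g/mol, assumed standard value
MW_3HP = 90.08             # g/mol, assumed standard value


def fold_over_control(titer, control=DHAB_ONLY_G_PER_L):
    """Titer relative to cells expressing dhaB alone."""
    return titer / control


def molar_conversion(titer, glycerol=GLYCEROL_G_PER_L):
    """Moles of 3-HP formed per mole of glycerol supplied."""
    return (titer / MW_3HP) / (glycerol / MW_GLYCEROL)


for gene, titer in sorted(TITER_G_PER_L.items(), key=lambda kv: kv[1], reverse=True):
    print(f"dhaB + {gene:5s}: {titer:.3f} g/L, "
          f"{fold_over_control(titer):.1f}x dhaB alone, "
          f"{molar_conversion(titer):.1%} of glycerol on a molar basis")
```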

Other Hosts and Promoters 

Applications of the 3-hydroxypropionic acid pathway such as the genetic
constructs of the present invention can easily be expressed in other organisms. The
required genes would need to be placed under control of an appropriate promoter or
promoters. Some organisms such as yeasts may require transcription terminators to be
placed after each transcribed unit. The knowledge of the present invention makes such
amendments possible. Such a genetic construct would need to be part of a vector that
could either replicate in the new host or integrate into the chromosome of the new host.
Many such vectors are commercially available for expression in gram-negative and
gram-positive bacteria, yeast, mammalian cells, insect cells, plants, etc. For example, to
express the 3-hydroxypropionic acid pathway in Rhodobacter capsulatus, one could
obtain vector pNH2 from the American Type Culture Collection (ATCC). This is a
shuttle vector for use in R. capsulatus and E. coli. Organisms such as Saccharomyces
cerevisiae which can convert glucose to glycerol could be used as a host; such a
construct would enable the production of 3-HP directly from glucose. Additionally,
other substrates such as xylan could also be used given the selection of an appropriate
host.

Stoichiometric analysis shows that the best stoichiometric yield of 3-HP production in
E. coli, calculated on the basis of glucose consumed, is obtained under aerobic
conditions. Under aerobic conditions CO2 is the only carbon-containing co-product; in
particular, the generation of lactic acid, which occurs under anaerobic conditions, is
avoided. Production of 3-HP under these conditions could result in a more economical
recovery of 3-HP from the fermentation broth.
Alternatively, the dhaB gene and a gene encoding the appropriate aldehyde
dehydrogenase could be cloned into the multiple cloning site of this vector in E. coli to
facilitate construction, and then transformed into R. capsulatus. The R. capsulatus nifH
promoter, provided on the plasmid, could be used to direct the transcription in R.
capsulatus of the genes placed into pNF2 in series with one promoter, or with two
copies of the nifH promoter. Expression of the genes in other organisms would require
a procedure analogous to that presented here.

Alternative Aldehyde Dehydrogenases and Glycerol Dehydratases 

Applications of the pathway for the production of 3-hydroxypropionic acid from
glycerol can be made using other suitable aldehyde dehydrogenases. To be functional in
this pathway an aldehyde dehydrogenase needs to be stable, readily expressed in the
host of choice and have high enough activity towards 3-hydroxypropionaldehyde to
enable it to make 3-HP. The knowledge of the present invention makes such
amendments possible. A program of directed evolution could be undertaken to select
for suitable aldehyde dehydrogenases, or they could be recovered from native sources.
The genes encoding these enzymes, in conjunction with a gene encoding an appropriate
glycerol dehydratase activity, would then be made part of any of the constructs
envisioned here to produce 3-hydroxypropionic acid from glycerol.

A similar program of enzyme improvement, including for example directed
evolution, could be carried out using the dhaB gene from Klebsiella pneumoniae as a
starting point to obtain other variants of glycerol dehydratase that are superior in
efficiency and stability to the form used in this invention. Alternatively, enzymes which
catalyze the same reaction may be isolated from other organisms and used in place of
the Klebsiella pneumoniae glycerol dehydratase. Such enzymes may be especially
useful in alternative hosts wherein they may be more readily expressed, be more stable
and more efficient under the fermentation conditions best suited to the growth of the
construct and the production and recovery of 3-HP.
CLAIM OR CLAIMS

I/WE CLAIM:

1. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which expresses genes
for non-native enzymes which are capable of catalyzing the production of
3-hydroxypropionic acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism, and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.

2. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which carries genetic
constructions for the expression of a glycerol dehydratase and an aldehyde
dehydrogenase which are capable of catalyzing the production of 3-hydroxypropionic
acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism, and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.
3. A method for producing 3-hydroxypropionic acid comprising the steps of
providing in a fermenter a recombinant microorganism which carries a genetic
construct which expresses the dhaB gene from Klebsiella pneumoniae and a gene for an
aldehyde dehydrogenase, which are capable of catalyzing the production of
3-hydroxypropionic acid from glycerol;

providing a source of glycerol or glucose for the recombinant microorganism, and

fermenting the microorganism under conditions which result in the accumulation
of 3-hydroxypropionic acid.

4. The method of claim 3 wherein the gene for the aldehyde dehydrogenase is
selected from the group consisting of ALDH4, ALD2, aldA and aldB.

5. The method of claim 3 wherein the aldehyde dehydrogenase is selected from
the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8.

6. A recombinant E. coli host comprising in its inheritable genetic materials
foreign genes encoding a glycerol dehydratase and an aldehyde dehydrogenase, such
that the host is capable of producing 3-hydroxypropionic acid from glycerol.

7. A recombinant E. coli host comprising in its inheritable genetic materials the
dhaB gene from Klebsiella pneumoniae and the ald4 gene from Saccharomyces
cerevisiae, such that the host is capable of producing 3-hydroxypropionic acid from glycerol.
8. A bacterial host comprising in its inheritable genetic material a genetic
construction encoding for the expression of a glycerol dehydratase enzyme and an
aldehyde dehydrogenase enzyme, such that the bacterial host is capable of converting
glycerol to 3-hydroxypropionic acid.

9. The bacterial host of claim 8 wherein the glycerol dehydratase is from
Klebsiella pneumoniae.

10. The bacterial host of claim 8 wherein the gene encoding the glycerol
dehydratase is the dhaB gene from Klebsiella pneumoniae.

11. The bacterial host of claim 8 wherein the aldehyde dehydrogenase is
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6 and
SEQ ID NO:8.

12. The bacterial host of claim 8 wherein the gene for the aldehyde
dehydrogenase is selected from the group consisting of ALDH4, ALD2, aldA and aldB.
SEQUENCE LISTING

<110> Suthers, Patrick F.
Cameron, Douglas C.

<120> Production of 3-Hydroxypropionic Acid in Recombinant Organisms

<130> UW960296.96S17

<140>
<141>

<160> 23

<170> PatentIn Ver. 2.1

<210> 1
<211> 1529
<212> DNA
<213> Saccharomyces cerevisiae

<220>
<221> CDS
<222> (25)..(1509)

<400> 1 

gtcgcggtac caaggaggta teat atg tea eac ett cct atg aca gtg cct 51 
Met Ser His Leu Pro Met Tbr Val Pro 
20 15 

ate aag ctg cce aat ggg ttg gaa tat gag eaa cca acg ggg ttg ttc 99 
Xle Lys Leu Pro Asn Oly Leu GIu Tyr Glu Gin Pro Thr Gly Leu Phe 
10 IS 20 25 



ate aac aac aag ttt gtt eet tet aaa eag aae aag aee ttc gaa gtc 
25 lie Asn Asn Lys Phe Val Pro Ser Lys Gin Asn Lys Thr Phe Glu Val 
30 35 40 
att aac cct tec acg gaa gaa gaa ata tgt cat att tat gaa ggt aga 
lie Asn Pro Ser Thr 6lu 61u Glu lie Cys His lie Tyr Olu Gly Arg 



gag gac gat gtg gaa gag gcc gtg cag gcc gcc gac cgt gcc ttc tct 
5 Glu Asp Asp Val Glu Glu Ala Val Gin Ala Ala Asp Arg Ala Phe Ser 



aat ggg tct tgg aac ggt ate gac cct att gac agg ggt eiag get ttg 
Asn Gly ser Trp Asn Gly lie Asp Pro lie Asp Arg Gly Lys Ala Leu 



10 tac agg tta gcc gaa tta att gaa cag gac aag gat gtc att get tec 
Tyr Arg Leu Ala Glu Leu lie Glu Gin Asp Lys Asp Val lie Ala Ser 
90 95 100 105 

ate gag act ttg gat aac ggt aaa get ate tct tec teg aga gga gat 
He Glu Thr Leu Asp Asn Gly Lys Ala He Ser Ser Ser Arg Gly Asp 
15 110 115 120 

gtt gat tta gtc ate sac tat ttg aaa tct tct get ggc ttt get gat 
Val Asp Leu Val He Asn Tyr Leu Lys Ser Ser Ala Gly Phe Ala Asp 
125 130 135 

aaa att gat ggt aga atg att gat act ggt aga acc cat ttt tct tae 
20 Lys He Asp Gly Arg Met He Asp Thr Gly Arg Thr His Phe Ser Tyr 
140 145 150 

act aag aga eag cct ttg ggt gtt tgt ggg cag att att cct tgg aat 
Thr Lys Arg Gin Pro Leu Gly Val Cys Gly Gin He He Pro Trp Asn 
155 160 165 

25 ttc cea ctg ttg atg tgg gee tgg aag att gee cct get ttg gtc acc 
Phe Pro Leu Leu Met Trp Ala Trp Lys He Ala Pro Ala Leu Val Thr 
170 175 180 185 
ggt aac acc gtc gtg ttg aag act gcc gaa tec acc cca ttg tec get 
Gly Asn Thr Val Val Leu Lys Thr Ala Qlu Ser Thr Pro Leu Ser Ala 
190 195 200 

ttg tat gtg tot aaa tac ate cca cag gcg ggt att cca cct ggt gtg 
5 Leu Tyr Val Ser Lys Tyr He Pro Gin Ala Gly He Pro Pro Gly Val 
205 210 215 

ate aac att gta tec ggg ttt ggt aag att gtg gtt gag gee att aca 
He Asn He Val Ser Gly Phe Gly Lys He Val Val Glu Ala He Thr 
220 225 230 

10 aac cat cca aaa. ate aaa aag gtt gcc ttc aca ggg tec acg get acg 
Asn His Pro Lys He Lys Lys Val Ala Phe Thr Gly Ser Thr Ala Thr 
235 240 245 

ggt aga cac att tac cag tec gca gcc gca ggc ttg aaa aaa gtg act 
Gly Arg His He Tyr Gin Ser Ala Ala Ala Gly Leu Lys Lys Val Thr 
IS 250 255 260 265 

ttg gag ctg ggt ggt aaa tea cca aac att gtc ttc gcg gac gcc gag 
Leu Glu Leu Gly Gly Lys Ser Pro Asn He Val Phe Ala Asp Ala Glu 
270 275 280 

ttg aaa aaa gcc gtg caa aac att ate ctt ggt ate tac tac aat tet 
20 Leu Lys Lys Ala Val Gin Asn He He Leu Gly He Tyr Tyr Asn Ser 
285 290 295 

ggt gag gtc tgt tgt gcg ggt tea agg gtg tat gtt gaa gaa tct att 
Gly Glu Val Cys Cys Ala Gly Ser Arg Val Tyr Val Glu Glu Ser He 
300 305 310 

25 tac gac aaa ttc att gaa gag ttc aaa gee get tct gaa tee ate aag 
Tyr Asp Lys Phe He Glu Glu Phe Lys Ala Ala Ser Glu Ser He Lys 
315 320 325 



WO 01/16346    PCT/US00/23878



gtg ggc gac cca ttc gat gaa tct act ttc caa ggt gca caa acc tct 
Val Gly Asp Pro Phe Asp Glu Ser Thr Phe Gin Gly Ala Gin Thr Ser 
330 335 340 345 

caa atg caa eta aac aaa ate ttg aaa tac gtt gac att ggt aag aat 
5 Gin Met Gin Leu Asn Lys lie Leu Lys Tyr Val Asp lie Gly Lys Asn 
350 355 360 

gaa ggt get act ttg att acc ggt ggt gaa aga tta ggt age aag ggt 
Glu Gly Ala Thr Leu lie Thr Gly Gly Glu Arg Leu Gly Ser Lys Gly 
365 370 375 

10 tac ttc att aag cca act gtc ttt ggt gac gtt aag gaa gac atg aga 
Tyr Phe lie Lys Pro Thr Val Phe Gly Asp Val Lys Glu Asp Met Arg 
380 385 390 

att gtc aaa gag gaa ate ttt ggc cct gtt gtc act gta acc aaa ttc 
He Val Lys Glu Glu He Phe Gly Pro Val Val Thr Val Thr Lys Phe 
IS 395 400 405 

aaa tct gee gac gaa gtc att aac atg geg aac gat tct gaa tac ggg 
Lys Ser Ala Asp Glu Val He Asn Met Ala Asn Asp Ser Glu Tyr Gly 
410 415 420 425 

ttg get get ggt att cac acc tct aat att aat ace gee tta aaa gtg 
20 Leu Ala Ala Gly He His Thr Ser Asn He Asn Thr Ala Leu Lys Val 
430 435 440 

get gat aga gtt aat geg ggt acg gtc tgg ata aac act tat aac gat 
Ala Asp Arg Val Asn Ala Gly Thr Val Trp He Asn Thr Tyr Asn Asp 
445 450 455 

25 ttc cac cac gca gtt cct ttc ggt ggg ttc aat gca tct ggt ttg ggc 
Phe His His Ala Val Pro Phe Gly Gly Phe Asn Ala Ser Gly Leu Gly 
460 465 470 
agg gaa atg tct gtt gat get tta caa aac tac ttg caa gtt aaa gcg 1491 
Arg Glu Met Ser Val Asp Ala Leu Gin Asn Tyr Leu Gin Val Lys Ala 
475 480 485 

gte cgt gcc aaa ttg gac gagtaagage tcgaattcgc 1529 
5 Val Arg Ala Lys Leu Asp 
490 495 



<210> 2
<211> 495
<212> PRT
<213> Saccharomyces cerevisiae

<400> 2

Met Ser His Leu Pro Met Thr Val Pro Ile Lys Leu Pro Asn Gly Leu
1 5 10 15

Glu Tyr Glu Gln Pro Thr Gly Leu Phe Ile Asn Asn Lys Phe Val Pro
20 25 30

Ser Lys Gln Asn Lys Thr Phe Glu Val Ile Asn Pro Ser Thr Glu Glu
35 40 45

Glu Ile Cys His Ile Tyr Glu Gly Arg Glu Asp Asp Val Glu Glu Ala
50 55 60

Val Gln Ala Ala Asp Arg Ala Phe Ser Asn Gly Ser Trp Asn Gly Ile
65 70 75 80

Asp Pro Ile Asp Arg Gly Lys Ala Leu Tyr Arg Leu Ala Glu Leu Ile
85 90 95

Glu Gln Asp Lys Asp Val Ile Ala Ser Ile Glu Thr Leu Asp Asn Gly
100 105 110

Lys Ala Ile Ser Ser Ser Arg Gly Asp Val Asp Leu Val Ile Asn Tyr
115 120 125
Leu Lys Ser Ser Ala Gly Phe Ala Asp Lys Ile Asp Gly Arg Met Ile
130 135 140

Asp Thr Gly Arg Thr His Phe Ser Tyr Thr Lys Arg Gln Pro Leu Gly
145 150 155 160

Val Cys Gly Gln Ile Ile Pro Trp Asn Phe Pro Leu Leu Met Trp Ala
165 170 175

Trp Lys Ile Ala Pro Ala Leu Val Thr Gly Asn Thr Val Val Leu Lys
180 185 190

Thr Ala Glu Ser Thr Pro Leu Ser Ala Leu Tyr Val Ser Lys Tyr Ile
195 200 205

Pro Gln Ala Gly Ile Pro Pro Gly Val Ile Asn Ile Val Ser Gly Phe
210 215 220

Gly Lys Ile Val Val Glu Ala Ile Thr Asn His Pro Lys Ile Lys Lys
225 230 235 240

Val Ala Phe Thr Gly Ser Thr Ala Thr Gly Arg His Ile Tyr Gln Ser
245 250 255

Ala Ala Ala Gly Leu Lys Lys Val Thr Leu Glu Leu Gly Gly Lys Ser
260 265 270

Pro Asn Ile Val Phe Ala Asp Ala Glu Leu Lys Lys Ala Val Gln Asn
275 280 285

Ile Ile Leu Gly Ile Tyr Tyr Asn Ser Gly Glu Val Cys Cys Ala Gly
290 295 300

Ser Arg Val Tyr Val Glu Glu Ser Ile Tyr Asp Lys Phe Ile Glu Glu
305 310 315 320

Phe Lys Ala Ala Ser Glu Ser Ile Lys Val Gly Asp Pro Phe Asp Glu
325 330 335
Ser Thr Phe Gln Gly Ala Gln Thr Ser Gln Met Gln Leu Asn Lys Ile
340 345 350

Leu Lys Tyr Val Asp Ile Gly Lys Asn Glu Gly Ala Thr Leu Ile Thr
355 360 365

Gly Gly Glu Arg Leu Gly Ser Lys Gly Tyr Phe Ile Lys Pro Thr Val
370 375 380

Phe Gly Asp Val Lys Glu Asp Met Arg Ile Val Lys Glu Glu Ile Phe
385 390 395 400

Gly Pro Val Val Thr Val Thr Lys Phe Lys Ser Ala Asp Glu Val Ile
405 410 415

Asn Met Ala Asn Asp Ser Glu Tyr Gly Leu Ala Ala Gly Ile His Thr
420 425 430

Ser Asn Ile Asn Thr Ala Leu Lys Val Ala Asp Arg Val Asn Ala Gly
435 440 445

Thr Val Trp Ile Asn Thr Tyr Asn Asp Phe His His Ala Val Pro Phe
450 455 460

Gly Gly Phe Asn Ala Ser Gly Leu Gly Arg Glu Met Ser Val Asp Ala
465 470 475 480

Leu Gln Asn Tyr Leu Gln Val Lys Ala Val Arg Ala Lys Leu Asp
485 490 495



<210> 3
<211> 1541
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (22)..(1521)

<400> 3

5 gcggtaccaa ggagatatca t atg tea gcc gcc gcc acc cag gcc gtg cct 

Met Ser Ala Ala Ala Thr Gin Ala Val Pro 

15 10 

gcc ccc aac cag cag ccc gag gtc ttc tgc aac cag att ttc ata aac 
Ala Pro Asn Gin Gin Pro Glu Val Phe Cys Asn Gin lie Phe lie Asn 
10 15 20 25 

aat gaa tgg cac gat gcc gtc age agg aaa aca ttc ccc acc gtc aat 
Asn Glu Trp His Asp Ala Val Ser Arg Lys Thr Phe Pro Thr Val Asn 



ccg tec act gga gag gtc ate tgt cag gta get gaa ggg gac aag gaa 
IS Pro Ser Thr Gly Glu Val lie Cys Gin Val Ala Glu Gly Asp Lys Glu 



gat gtg gac aag gca cgt gaa ggc cgc ccg ggc gcc ttc cag ctg ggc 
Asp Val Asp Lys Ala Arg Glu Gly Arg Pro Gly Ala Phe Gin Leu Gly 



20 tea cct tgg cgc cgc atg gac gca tea cac age ggc egg ctg ctg aac 
Ser Pro Trp Arg Arg Met Asp Ala Ser His Ser Gly Arg Ijeu Leu Asn 



cgc ctg gcc gat ctg ate gag egg gac egg acc tac ctg gcg gcc ttg 
Arg I<eu Ala Asp lieu lie Glu Arg Asp Arg Thr Tyr Leu Ala Ala Leu 
25 95 100 105 

gag acc ctg gac aat ggc aag ccc tat gtc ate tec tac ctg gtg gat 
Glu Thr Leu Asp Asn Gly Lys Pro Tyr Val lie Ser Tyr Leu Val Asp 
110 115 120 
ttg gac atg gtc etc aaa tgt etc egg tat tat gcc ggc tgg get gat 
Leu Asp Met Val Leu Lys Cys Leu Arg Tyr Tyr Ala Oly Trp Ala Asp 
125 130 135 

aag tae eac ggg aaa ace ate cec att gac gga gac ttc ttc age tac 
5 Lys Tyr His 61y Lys Thr lie Pro lie Asp Gly Asp Phe Phe Ser Tyr 
140 145 ISO 

aca cge cat gaa cct gtg ggg gtg tgc ggg eag ate att ccg tgg aat 
Tbr Arg His Glu Pro Val Gly Val Cys Oly Gin He He Pro Trp Asn 
155 160 165 170 

10 ttc ccg etc ctg atg caa gca tgg aag ctg ggc cca gcc ttg gca act 
Phe Pro Leu Leu Met Gin Ala Trp Lys Leu Gly Pro Ala Leu Ala Thr 
175 180 IBS 

gga aac gtg gtt gtg atg aag gta get gag cag aca ccc etc acc gcc 
Gly Asn Val Val Val Met Lys Val Ala Glu Gin Thr Pro Leu Thr Ala 
IS 190 195 200 

etc tat gtg gcc aac ctg ate aag gag get ggc ttt cec cct ggt gtg 
Leu Tyr Val Ala Asn Leu He Lys Glu Ala Gly Phe Pro Pro Gly Val 
205 210 215 

gtc aac att gtg cct gga ttt ggc cec acg get ggg gee gcc att gee 
20 Val Asn He Val Pro Gly Phe Oly Pro Thr Ala Gly Ala Ala He Ala 
220 225 230 

tec cat gag gat gtg gac aaa gtg gca ttc aca ggc tec act gag att 
Ser His Glu Asp Val Asp Lys Val Ala Phe Thr Gly Ser Thr Glu He 
235 240 245 250 

25 ggc cgc gta ate cag gtt get get ggg age age aac etc aag aga gtg 
Gly Arg Val He Gin Val Ala Ala Gly Ser Ser Asn Leu Lys Arg Val 
255 260 265 
acc ttg gag ctg ggg ggg aag age ccc aac ate ate atg tea gat gee 
Thr Leu 6lu Leu 61y Gly Lys Ser Pro Asn He lie Met Ser Asp Ala 
270 275 280 

gat atg gat tgg gcc gtg gaa cag gee cac ttc gcc ctg ttc ttc aac 
5 Asp Met Asp Trp Ala Val 61u Gin Ala His Phe Ala Leu Phe Phe Asn 
285 290 295 

cag ggc cag tgc tgc tgt gcc ggc tec egg acc ttc gtg cag gag gac 
Oln Gly Gin Cya Cys Cys Ala Gly Ser Arg Thr Phe Val Gin Glu Asp 
300 305 310 

10 ate tat gat gag ttt gtg gtg egg ago gtt gee egg gee aag tct egg 
He Tyr Asp Glu Phe Val Val Arg Ser Val Ala Arg Ala Lys Ser Arg 
315 320 325 330 

gtg gtc ggg aac ece ttt gat age aag aee gag cag ggg ecg cag gtg 
Val Val Gly Asn Pro Phe Asp Ser Lys Thr Glu Gin Gly Pro Gin Val 
15 335 340 345 

gat gaa act cag ttt aag aag ate etc ggc tac ate aac aog ggg aag 
Asp Glu Thr Gin Phe Lys Lys He Leu Gly Tyr He Asn Thr Gly Lys 
350 355 360 

caa gag ggg geg aag ctg ctg tgt ggt ggg ggc att get get gac egt 
20 Gin Glu Gly Ala Lys Leu Leu Cys Gly Gly Gly He Ala Ala Asp Arg 
365 370 375 

ggt tac ttc ate cag ccc act gtg ttt gga gat gtg cag gat ggc atg 
Gly Tyr Phe He Gin Pro Thr Val Phe Gly Asp Val Gin Asp Gly Met 
380 385 390 

25 ace ate gcc aag gag gag ate ttc ggg cea gtg atg cag ate ctg aag 
Thr He Ala Lys Glu Glu He Phe Gly Pro Val Met Gin He Leu Lys 
395 400 405 410 
ttc aag acc ata gag gag gtt gtt ggg aga gcc aac aat tec acg tac 1299 
Phe Lys Thr He Glu Glu Val Val Gly Arg Ala Asn Asn Ser Thr Tyr 
415 420 425 

ggg ctg gcc gca get gtc ttc aca aag gat ttg gac aag gcc aat tac 1347 
S Gly Leu Ala Ala Ala Val Phe Thr Lys Asp Leu Asp Lys Ala Asn Tyr 
430 435 440 

ctg tec cag gcc etc eag geg ggc act gtg tgg gtc aac tgc tat gat 1395 
Leu Ser Gin Ala Leu Gin Ala Gly Thr Val Trp Val Asn Cys Tyr Asp 
445 450 455 

10 gtg ttt gga gee cag tea ecc ttt ggt ggc tac aag atg teg ggg agt 1443 
Val Phe Gly Ala Gin Ser Pro Phe Gly Gly Tyr Lys Met Ser Gly Ser 
460 465 470 

ggc egg gag ttg ggc gag tac ggg ctg cag gca tac act gaa gtg aaa 1491 
Gly Arg Glu Leu Gly Glu Tyr Gly Leu Gin Ala Tyr Thr Glu Val Lys 
15 475 480 485 490 

act gtc aca gtc aaa gtg cet cag aag aac teataagagc tegaattcgc 1541 
Thr Val Thr Val Lys Val Pro Gin Lys Asn 
495 500 



<210> 4
<211> 500
<212> PRT
<213> Homo sapiens

<400> 4

Met Ser Ala Ala Ala Thr Gln Ala Val Pro Ala Pro Asn Gln Gln Pro
1 5 10 15

Glu Val Phe Cys Asn Gln Ile Phe Ile Asn Asn Glu Trp His Asp Ala
20 25 30

Val Ser Arg Lys Thr Phe Pro Thr Val Asn Pro Ser Thr Gly Glu Val
35 40 45

Ile Cys Gln Val Ala Glu Gly Asp Lys Glu Asp Val Asp Lys Ala Arg
50 55 60

Glu Gly Arg Pro Gly Ala Phe Gln Leu Gly Ser Pro Trp Arg Arg Met
65 70 75 80

Asp Ala Ser His Ser Gly Arg Leu Leu Asn Arg Leu Ala Asp Leu Ile
85 90 95

Glu Arg Asp Arg Thr Tyr Leu Ala Ala Leu Glu Thr Leu Asp Asn Gly
100 105 110

Lys Pro Tyr Val Ile Ser Tyr Leu Val Asp Leu Asp Met Val Leu Lys
115 120 125

Cys Leu Arg Tyr Tyr Ala Gly Trp Ala Asp Lys Tyr His Gly Lys Thr
130 135 140

Ile Pro Ile Asp Gly Asp Phe Phe Ser Tyr Thr Arg His Glu Pro Val
145 150 155 160

Gly Val Cys Gly Gln Ile Ile Pro Trp Asn Phe Pro Leu Leu Met Gln
165 170 175

Ala Trp Lys Leu Gly Pro Ala Leu Ala Thr Gly Asn Val Val Val Met
180 185 190

Lys Val Ala Glu Gln Thr Pro Leu Thr Ala Leu Tyr Val Ala Asn Leu
195 200 205

Ile Lys Glu Ala Gly Phe Pro Pro Gly Val Val Asn Ile Val Pro Gly
210 215 220
Phe Gly Pro Thr Ala Gly Ala Ala Ile Ala Ser His Glu Asp Val Asp
225 230 235 240

Lys Val Ala Phe Thr Gly Ser Thr Glu Ile Gly Arg Val Ile Gln Val
245 250 255

Ala Ala Gly Ser Ser Asn Leu Lys Arg Val Thr Leu Glu Leu Gly Gly
260 265 270

Lys Ser Pro Asn Ile Ile Met Ser Asp Ala Asp Met Asp Trp Ala Val
275 280 285

Glu Gln Ala His Phe Ala Leu Phe Phe Asn Gln Gly Gln Cys Cys Cys
290 295 300

Ala Gly Ser Arg Thr Phe Val Gln Glu Asp Ile Tyr Asp Glu Phe Val
305 310 315 320

Val Arg Ser Val Ala Arg Ala Lys Ser Arg Val Val Gly Asn Pro Phe
325 330 335

Asp Ser Lys Thr Glu Gln Gly Pro Gln Val Asp Glu Thr Gln Phe Lys
340 345 350

Lys Ile Leu Gly Tyr Ile Asn Thr Gly Lys Gln Glu Gly Ala Lys Leu
355 360 365

Leu Cys Gly Gly Gly Ile Ala Ala Asp Arg Gly Tyr Phe Ile Gln Pro
370 375 380

Thr Val Phe Gly Asp Val Gln Asp Gly Met Thr Ile Ala Lys Glu Glu
385 390 395 400

Ile Phe Gly Pro Val Met Gln Ile Leu Lys Phe Lys Thr Ile Glu Glu
405 410 415
Val Val Gly Arg Ala Asn Asn Ser Thr Tyr Gly Leu Ala Ala Ala Val
420 425 430

Phe Thr Lys Asp Leu Asp Lys Ala Asn Tyr Leu Ser Gln Ala Leu Gln
435 440 445

Ala Gly Thr Val Trp Val Asn Cys Tyr Asp Val Phe Gly Ala Gln Ser
450 455 460

Pro Phe Gly Gly Tyr Lys Met Ser Gly Ser Gly Arg Glu Leu Gly Glu
465 470 475 480

Tyr Gly Leu Gln Ala Tyr Thr Glu Val Lys Thr Val Thr Val Lys Val
485 490 495

Pro Gln Lys Asn
500



<210> 5
<211> 1512
<212> DNA
<213> Escherichia coli

<220>
<221> CDS
<222> (37)..(1473)

<400> 5

gctaccatgg cttaaccggt accaaggaga tatcat atg tea gta ccc gtt caa 

Met Ser Val Pro Val Gin 



cat cct atg tat ate gat gga cag ttt gtt acc tgg cgt gga gac gca 102 
25 His Pro Met Tyr lie Asp Gly Gin Phe Val Thr Trp Arg Gly Asp Ala 
egg att gat gtg gta aac cct get aca gag get gtc att tec cge ata 
Trp lie Asp Val Val Asn Pro Ala Thr Glu Ala Val He Ser Arg He 



ccc gat ggt cag gcc gag gat gcc cgt aag gca ate gat gca gca gaa 
5 Pro Asp Gly Gin Ala Glu Asp Ala Arg Lys Ala He Asp Ala Ala Glu 



cgt gca caa cca gaa tgg gaa gcg ttg cct get att gaa cge gcc agt 

Arg Ala Gin Pro Glu Trp Glu Ala I^eu Pro Ala He Glu Arg Ala Ser 

SS 60 65 70 

10 tgg ttg cge aaa ate tec gcc ggg ate cge gaa cge gee agt gaa ate 

Trp Leu Arg Lys He Ser Ala Gly He Arg Glu Arg Ala Ser Glu He 



agt gcg ctg att gtt gaa gaa ggg ggc aag ate cag cag ctg get gaa 
Ser Ala Leu He Val Glu Glu Gly Gly Lys He Gin Gin Leu Ala Glu 
IS 90 95 100 

gtc gaa gtg get ttt act gcc gae tat ate gat tac atg gcg gag tgg 
Val Glu Val Ala Phe Thr Ala Asp Tyr He Asp Tyr Met Ala Glu Trp 
105 110 115 

gca egg cgt tac gag ggc gag att att caa age gat cgt cea gga gaa 
20 Ala Arg Arg Tyr Glu Gly Glu He He Gin Ser Asp Arg Pro Gly Glu 
120 125 130 

aat att ctt ttg ttt aaa cgt gcg ctt ggt gtg act ace ggc att ctg 
Asn He Leu lieu Phe Lys Arg Ala Leu Gly Val Thr Thr Gly He Leu 
135 140 145 150 

25 ccg tgg aac ttc ccg ttc ttc etc att gcc cge aaa atg get ccc get 
Pro Trp Asn Phe Pro Phe Phe Leu He Ala Arg Lys Met Ala Pro Ala 
155 160 165 
ctt ttg acc ggt aat acc ate gtc att aaa cct agt gaa ttt acg aca 582 
Leu Leu Thr Gly Asn Thr lie Val lie Lys Pro Ser 61u Phe Thr Thr 
170 175 180 

aac aat gcg att gca ttc gcc aaa ate gtc gat gaa ata ggc ctt ccg 630 
5 Asn Asn Ala lie Ala Phe Ala Lys He Val Asp Glu He Gly Leu Pro 
185 190 195 

cgc ggc gtg ttt aac ctt gta ctg ggg cgt ggt gaa acc gtt ggg caa 676 
Arg Gly Val Phe Asn Leu Val Leu Gly Arg Gly Glu Thr Val Gly Gin 
200 205 210 

10 gaa ctg gcg ggt aac cca aag gtc gca atg gtc agt atg aca ggc age 726 
Glu Leu Ala Gly Asn Pro Lys Val Ala Met Val Ser Met Thr Gly Ser 
215 220 225 230 

gtc tct gca ggt gag aag ate atg gcg act gcg gcg aaa aac ate acc 774 
Val Ser Ala Gly Glu Lys He Met Ala Thr Ala Ala Lys Asn Zle Thr 
IS 235 240 245 

aaa gtg tgt ctg gaa ttg ggg ggt aaa gca cca get ate gta atg gac 822 
Lys Val Cys Leu Glu Leu Gly Gly Lys Ala Pro Ala Zle Val Met Asp 
250 255 260 

gat gee gat ctt gaa ctg gca gtc aaa gcc ate gtt gat tea cgc gtc 870 
20 Asp Ala Asp Leu Glu Leu Ala Val Lys Ala He Val Asp Ser Arg Val 
265 270 275 

att aat agt ggg caa gtg tgt aac tgt gca gaa cgt gtt tat gta cag 918 
Zle Asn Ser Gly Gin Val Cys Asn Cys Ala Glu Arg Val Tyr Val Gin 
280 285 290 

25 aaa ggc att tat gat cag ttc gtc aat egg ctg ggt gaa gcg atg cag 966 
Lys Gly Zle Tyr Asp Gin Phe Val Asn Arg Leu Gly Glu Ala Met Gin 
295 300 305 310 
gcg gtt caa ttt ggt aac ccc get gaa cgc aac gac att gcg atg ggg 
Ala Val 6Ln Phe 61y Asn Pro Ala Glu Arg Asn Asp lie Ala Met Gly 
315 320 325 

ccg ttg att aac gee gcg gcg ctg gaa agg gtc gag caa aaa gtg gcg 
5 Pro Leu lie Asn Ala Ala Ala Leu Glu Arg Val Glu Gin Lys Val Ala 
330 335 340 

cgc gca gta gaa gaa ggg gcg aga gtg gcg ttc ggt ggc aaa gcg gta 
Arg Ala Val Glu Glu Gly Ala Arg Val Ala Phe Gly Gly Lys Ala Val 
345 350 355 

10 gag ggg aaa gga tat tat tat ccg ccg aca ttg ctg ctg gat gtt cgc 
Glu Gly Lys Gly Tyr Tyr Tyr Pro Pro Thr Leu Leu Leu Asp Val Arg 
360 365 370 

cag gaa atg teg att atg cat gag gaa acc ttt ggc ccg gtg ctg cca 
Gin Glu Met Ser lie Met His Glu Glu Thr Phe Gly Pro Val Leu Pro 
IS 375 380 385 390 

gtt gtc gca ttt gac acg ctg gaa gat get ate tea atg get aat gac 
Val val Ala Phe Asp Thr Leu Glu Asp Ala lie Ser Met Ala Asn Asp 
395 400 405 

agt gat tac ggc ctg acc tea tea ate tat acc caa aat ctg aac gtc 
20 Ser Asp Tyr Gly Leu Thr Ser Ser lie Tyr Thr Gin Asn Leu Asn Val 
410 4X5 420 

gcg atg aaa gee att aaa ggg ctg aag ttt ggt gaa act tac ate aac 
Ala Met Lys Ala He Lys Gly Leu Lys Phe Gly Glu Thr Tyr He Asn 
425 430 435 

25 cgt gaa aac ttc gaa get atg caa ggc ttc cac gee gga tgg cgt aaa 
Arg Glu Asn Phe Glu Ala Met Gin Gly Phe His Ala Gly Trp Arg Lys 
440 445 450 
tec ggt att ggc ggc gca gat ggt aaa cat ggc ttg cat gga tat ctg 1446 
Ser 61y He Gly Gly Ala Asp Gly Lys His Oly Leu His Gly Tyr Lau 
455 460 465 470 

cag acc cag gtg gtt tat tta cag tot taagagctcg aattcccgtc 1493 
5 Gin Tbr Gin Val Val Tyr Leu Gin Ser 
475 

gacggctcta gactcgagcg 1513 



<210> 6
<211> 479
<212> PRT
<213> Escherichia coli

<400> 6

Met Ser Val Pro Val Gln His Pro Met Tyr Ile Asp Gly Gln Phe Val
1 5 10 15

Thr Trp Arg Gly Asp Ala Trp Ile Asp Val Val Asn Pro Ala Thr Glu
20 25 30

Ala Val Ile Ser Arg Ile Pro Asp Gly Gln Ala Glu Asp Ala Arg Lys
35 40 45

Ala Ile Asp Ala Ala Glu Arg Ala Gln Pro Glu Trp Glu Ala Leu Pro
50 55 60

Ala Ile Glu Arg Ala Ser Trp Leu Arg Lys Ile Ser Ala Gly Ile Arg
65 70 75 80

Glu Arg Ala Ser Glu Ile Ser Ala Leu Ile Val Glu Glu Gly Gly Lys
85 90 95

Ile Gln Gln Leu Ala Glu Val Glu Val Ala Phe Thr Ala Asp Tyr Ile
100 105 110
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Asp Tyr Met Ala Olu Trp Ala Arg Arg Tyr Glu Gly Glu lie lie Gin 
lis 120 125 

Ser Asp Arg Pro Gly Glu Asn lie Leu Leu Phe Lys Arg Ala Leu Gly 
130 135 140 

S Val Thr Thr Gly lie Leu Pro Trp Asn Phe Pro Phe Phe Leu lie Ala 
145 150 155 160 

Arg Lys Met Ala Pro Ala Leu Leu Thr Gly Asn Thr lie Val lie Lys 
165 170 175 

Pro Ser Glu Phe Thr Thr Asn Asn Ala lie Ala Phe Ala Lys He Val 
10 180 185 190 

Asp Glu He Gly Leu Pro Arg Gly Val Phe Asn Leu Val Leu Gly Arg 
195 200 205 

Gly Glu Thr Val Gly Gin Glu Leu Ala Gly Asn Pro Lys Val Ala Met 
210 215 220 

IS Val Ser Met Thr Gly Ser Val Ser Ala Gly Glu Lys He Met Ala Thr 
225 230 235 240 

Ala Ala Lys Asn He Thr Lys Val Cys Leu Glu Ijeu Gly Gly Lys Ala 
245 250 255 

Pro Ala He Val Met Asp Asp Ala Asp Leu Glu Leu Ala Val Lys Ala 
20 260 265 270 

He Val Asp Ser Arg Val He Asn Ser Gly Gin Val Cys Asn Cys Ala 
275 280 285 

Glu Arg Val Tyr Val Gin Lys Gly He Tyr Asp Gin Phe Val Asn Arg 
290 295 300 
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Leu Gly Glu Ala Met Gin Ala Val Gin Phe Qly Asn Pro Ala Glu Arg 
305 310 315 320 

Asn Asp He Ala Met Gly Pro Leu He Asn Ala Ala Ala Leu Glu Arg 
325 330 335 

5 Val Glu Gin Lys Val Ala Arg Ala Val Glu Glu Gly Ala Arg Val Ala 
340 345 350 

Phe Gly Gly Lys Ala Val Glu Gly Lys Gly Tyr Tyr Tyr Pro Pro Tlur 
355 360 365 

Leu Leu Leu Asp Val Arg Qln Glu Met Ser He Met His Glu Glu Thr 
10 370 375 380 

Phe Oly Pro Val Leu Pro Val Val Ala Phe Asp Thr Leu Glu Asp Ala 
385 390 395 400 

He Ser Met Ala Asn Asp Ser Asp Tyr Gly Leu Thr Ser Ser He Tyr 
405 410 415 

IS Thr Gin Asn Leu Asn Val Ala Met Lys Ala He Lys Gly Leu Lys Phe 
420 425 430 

Gly Glu Thr Tyr He Asn Arg Glu Asn Phe Glu Ala Met Qln Gly Phe 
435 440 445 

His Ala Gly Trp Arg Lys Ser Gly He Gly Qly Ala Asp Gly Lys His 
20 450 455 460 

Gly Leu His Gly Tyr Leu Gin Thr Gin Val Val Tyr Leu Gin Ser 
465 470 475 
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<2X0> 7 
<211> 1574 
<212> DMA 

<213> Escherichia coli 

<220>

<221> CDS 

<222> (22).. (1557) 

<400> 7 

gcggtaccaa ggaggtatca t atg acc aat aat ccc cct tca gca cag att

Met Thr Asn Asn Pro Pro Ser Ala Gln Ile

1 5 10

aag ccc ggc gag tat ggt ttc ccc ctc aag tta aaa gcc cgc tat gac
Lys Pro Gly Glu Tyr Gly Phe Pro Leu Lys Leu Lys Ala Arg Tyr Asp



aac ttt att ggc ggc gaa tgg gta gcc cct gcc gac ggc gag tat tac
Asn Phe Ile Gly Gly Glu Trp Val Ala Pro Ala Asp Gly Glu Tyr Tyr



cag aat ctg acg ccg gtg acc ggg cag ctg ctg tgc gaa gtg gcg tct 195
Gln Asn Leu Thr Pro Val Thr Gly Gln Leu Leu Cys Glu Val Ala Ser
45 50 55

tcg ggc aaa cga gac atc gat ctg gcg ctg gat gct gcg cac aaa gtg 243
Ser Gly Lys Arg Asp Ile Asp Leu Ala Leu Asp Ala Ala His Lys Val
60 65 70

aaa gat aaa tgg gcg cac acc tcg gtg cag gat cgt gcg gcg att ctg 291
Lys Asp Lys Trp Ala His Thr Ser Val Gln Asp Arg Ala Ala Ile Leu
75 80 85 90

ttt aag att gcc gat cga atg gaa caa aac ctc gag ctg tta gcg aca 339
Phe Lys Ile Ala Asp Arg Met Glu Gln Asn Leu Glu Leu Leu Ala Thr
95 100 105

gct gaa acc tgg gat aac ggo aaa ccc att cgc gaa acc agt gct gcg
Ala Glu Thr Trp Asp Asn Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala
110 115 120

gat gta ccg ctg gcg att gac cat ttc cgc tat ttc gcc tcg tgt att
Asp Val Pro Leu Ala Ile Asp His Phe Arg Tyr Phe Ala Ser Cys Ile
125 130 135

cgg gcg cag gaa ggt ggg atc agt gaa gtt gat agc gaa acc gtg gcc
Arg Ala Gln Glu Gly Gly Ile Ser Glu Val Asp Ser Glu Thr Val Ala
140 145 150

tat cat ttc cat gaa ccg tta ggc gtg gtg ggg cag att atc ccg tgg
Tyr His Phe His Glu Pro Leu Gly Val Val Gly Gln Ile Ile Pro Trp
155 160 165 170

aac ttc ccg ctg ctg atg gcg agc tgg aaa atg gct ccc gcg ctg gcg
Asn Phe Pro Leu Leu Met Ala Ser Trp Lys Met Ala Pro Ala Leu Ala
175 180 185

gcg ggc aac tgt gtg gtg ctg aaa ccc gca cgt ctt acc ccg ctt tct
Ala Gly Asn Cys Val Val Leu Lys Pro Ala Arg Leu Thr Pro Leu Ser
190 195 200

gta ctg ctg cta atg gaa att gtc ggt gat tta ctg ccg ccg ggc gtg
Val Leu Leu Leu Met Glu Ile Val Gly Asp Leu Leu Pro Pro Gly Val
205 210 215

gtg aac gtg gtc aat ggc gca ggt ggg gta att ggc gaa tat ctg gcg
Val Asn Val Val Asn Gly Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala
220 225 230

acc tcg aaa cgc atc gcc aaa gtg gcg ttt acc ggc tca acg gaa gtg
Thr Ser Lys Arg Ile Ala Lys Val Ala Phe Thr Gly Ser Thr Glu Val
235 240 245 250

ggc caa caa att atg caa tac gca acg caa aac att att ccg gtg acg
Gly Gln Gln Ile Met Gln Tyr Ala Thr Gln Asn Ile Ile Pro Val Thr
255 260 265

ctg gag ttg ggc ggt aag tcg cca aat atc gtc ttt gct gat gtg atg
Leu Glu Leu Gly Gly Lys Ser Pro Asn Ile Val Phe Ala Asp Val Met
270 275 280

gat gaa gaa gat gcc ttt ttc gat aaa gcg ctg gaa ggc ttt gca ctg
Asp Glu Glu Asp Ala Phe Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu
285 290 295

ttt gcc ttt aac cag ggc gaa gtt tgc acc tgt ccg agt cgt gct tta
Phe Ala Phe Asn Gln Gly Glu Val Cys Thr Cys Pro Ser Arg Ala Leu
300 305 310

gtg cag gaa tct atc tac gaa cgc ttt atg gaa cgc gcc atc cgc cgt
Val Gln Glu Ser Ile Tyr Glu Arg Phe Met Glu Arg Ala Ile Arg Arg
315 320 325 330

gtc gaa agc att cgt agc ggt aac ccg ctc gac agc gtg acg caa atg
Val Glu Ser Ile Arg Ser Gly Asn Pro Leu Asp Ser Val Thr Gln Met
335 340 345

ggc gcg cag gtt tct cac ggg caa ctg gaa acc atc ctc aac tac att
Gly Ala Gln Val Ser His Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile
350 355 360

gat atc ggt aaa aaa gag ggc gct gac gtg ctc aca ggc ggg cgg cgc
Asp Ile Gly Lys Lys Glu Gly Ala Asp Val Leu Thr Gly Gly Arg Arg
365 370 375

aag ctg ctg gaa ggt gaa ctg aaa gac ggc tac tac ctc gaa ccg acg
Lys Leu Leu Glu Gly Glu Leu Lys Asp Gly Tyr Tyr Leu Glu Pro Thr
380 385 390

att ctg ttt ggt cag aac aat atg cgg gtg ttc cag gag gag att ttt
Ile Leu Phe Gly Gln Asn Asn Met Arg Val Phe Gln Glu Glu Ile Phe
395 400 405 410

ggc ccg gtg ctg gcg gtg acc acc ttc aaa acg atg gaa gaa gcg ctg
Gly Pro Val Leu Ala Val Thr Thr Phe Lys Thr Met Glu Glu Ala Leu
415 420 425

gag ctg gcg aac gat acg caa tat ggc ctg ggc gcg ggc gtc tgg agc
Glu Leu Ala Asn Asp Thr Gln Tyr Gly Leu Gly Ala Gly Val Trp Ser
430 435 440

cgc aac ggt aat ctg gcc tat aag atg ggg cgc ggc ata cag gct ggg
Arg Asn Gly Asn Leu Ala Tyr Lys Met Gly Arg Gly Ile Gln Ala Gly
445 450 455

cgc gtg tgg acc aac tgt tat cac gct tac ccg gca cat gcg gcg ttt
Arg Val Trp Thr Asn Cys Tyr His Ala Tyr Pro Ala His Ala Ala Phe
460 465 470

ggt ggc tac aaa caa tca ggt atc ggt cgc gaa acc cac aag atg atg
Gly Gly Tyr Lys Gln Ser Gly Ile Gly Arg Glu Thr His Lys Met Met
475 480 485 490

ctg gag cat tac cag caa acc aag tgc ctg ctg gtg agc tac tcg gat
Leu Glu His Tyr Gln Gln Thr Lys Cys Leu Leu Val Ser Tyr Ser Asp
495 500 505

aaa ccg ttg ggg ctg ttc taagagctcg aattcgc
Lys Pro Leu Gly Leu Phe
510

<210> 8
<211> 512
<212> PRT

<213> Escherichia coli
<400> 8

Met Thr Asn Asn Pro Pro Ser Ala Gln Ile Lys Pro Gly Glu Tyr Gly
1 5 10 15

Phe Pro Leu Lys Leu Lys Ala Arg Tyr Asp Asn Phe Ile Gly Gly Glu
20 25 30

Trp Val Ala Pro Ala Asp Gly Glu Tyr Tyr Gln Asn Leu Thr Pro Val
35 40 45

Thr Gly Gln Leu Leu Cys Glu Val Ala Ser Ser Gly Lys Arg Asp Ile
50 55 60

Asp Leu Ala Leu Asp Ala Ala His Lys Val Lys Asp Lys Trp Ala His
65 70 75 80

Thr Ser Val Gln Asp Arg Ala Ala Ile Leu Phe Lys Ile Ala Asp Arg
85 90 95

Met Glu Gln Asn Leu Glu Leu Leu Ala Thr Ala Glu Thr Trp Asp Asn
100 105 110

Gly Lys Pro Ile Arg Glu Thr Ser Ala Ala Asp Val Pro Leu Ala Ile
115 120 125

Asp His Phe Arg Tyr Phe Ala Ser Cys Ile Arg Ala Gln Glu Gly Gly
130 135 140

Ile Ser Glu Val Asp Ser Glu Thr Val Ala Tyr His Phe His Glu Pro
145 150 155 160

Leu Gly Val Val Gly Gln Ile Ile Pro Trp Asn Phe Pro Leu Leu Met
165 170 175

Ala Ser Trp Lys Met Ala Pro Ala Leu Ala Ala Gly Asn Cys Val Val
180 185 190

Leu Lys Pro Ala Arg Leu Thr Pro Leu Ser Val Leu Leu Leu Met Glu
195 200 205

Ile Val Gly Asp Leu Leu Pro Pro Gly Val Val Asn Val Val Asn Gly
210 215 220

Ala Gly Gly Val Ile Gly Glu Tyr Leu Ala Thr Ser Lys Arg Ile Ala
225 230 235 240

Lys Val Ala Phe Thr Gly Ser Thr Glu Val Gly Gln Gln Ile Met Gln
245 250 255

Tyr Ala Thr Gln Asn Ile Ile Pro Val Thr Leu Glu Leu Gly Gly Lys
260 265 270

Ser Pro Asn Ile Val Phe Ala Asp Val Met Asp Glu Glu Asp Ala Phe
275 280 285

Phe Asp Lys Ala Leu Glu Gly Phe Ala Leu Phe Ala Phe Asn Gln Gly
290 295 300

Glu Val Cys Thr Cys Pro Ser Arg Ala Leu Val Gln Glu Ser Ile Tyr
305 310 315 320

Glu Arg Phe Met Glu Arg Ala Ile Arg Arg Val Glu Ser Ile Arg Ser
325 330 335

Gly Asn Pro Leu Asp Ser Val Thr Gln Met Gly Ala Gln Val Ser His
340 345 350

Gly Gln Leu Glu Thr Ile Leu Asn Tyr Ile Asp Ile Gly Lys Lys Glu
355 360 365

Gly Ala Asp Val Leu Thr Gly Gly Arg Arg Lys Leu Leu Glu Gly Glu
370 375 380

Leu Lys Asp Gly Tyr Tyr Leu Glu Pro Thr Ile Leu Phe Gly Gln Asn
385 390 395 400

Asn Met Arg Val Phe Gln Glu Glu Ile Phe Gly Pro Val Leu Ala Val
405 410 415

Thr Thr Phe Lys Thr Met Glu Glu Ala Leu Glu Leu Ala Asn Asp Thr
420 425 430

Gln Tyr Gly Leu Gly Ala Gly Val Trp Ser Arg Asn Gly Asn Leu Ala
435 440 445

Tyr Lys Met Gly Arg Gly Ile Gln Ala Gly Arg Val Trp Thr Asn Cys
450 455 460

Tyr His Ala Tyr Pro Ala His Ala Ala Phe Gly Gly Tyr Lys Gln Ser
465 470 475 480

Gly Ile Gly Arg Glu Thr His Lys Met Met Leu Glu His Tyr Gln Gln
485 490 495

Thr Lys Cys Leu Leu Val Ser Tyr Ser Asp Lys Pro Leu Gly Leu Phe
500 505 510



<210> 9
<211> 5267
<212> DNA

<213> Klebsiella pneumoniae
<223> Location complement 300..2153
<220>

<223> Location complement 2166..2591
<220>

<223> Location complement 2594..3034
<220>

<223> Location complement 2191..4858



<400> 9 



acat. tcf a tec aatttctatg cgcacccgtt ctcggagcac tgtccgaccg 60
cgcccagtcc tgctcgcttc gctacttgga gccHCtatcg actacgcgat 120
ca ggcgacc acacccgtcc tgtggatctc ccactgacca aagctggccc cggcgacccg 180
cgccgaagaa aatgagcaat ccggtgccaa gaaactcggc 240
cacgcactgc tagaagtctg gttcattatc ggcatcctga aatagcacgt 300
taaagagaga ggctggcgcg taattcgcct gaccggccag tagcagcccg 360
gtggcgaccg cattgcgcgg cccttctgtt ccccgaatat tgccctgccc ggcgaccacg 420
ccatagtgcg acaaggcttc cgtgataagc tgcgggatct caaagtccag cgatgagccg 480
cccaccagca ccacaaaggc gatatcgcga atggaaccgc cgggtgagac ctggcgcagc 540
gcgcgcaggc agttggtgac aaacactttc tctttcgcct gccggcgcac gagacgiuitt 600
ttttccagcg ggctggcgtt atcgatcggc accagttcgc cctccttgat gtacaccact 660
ttggcgaaca ccgccgggct gagggcttcc cgaaagaact ccaccgcgcc attctcgtga 720
cgaatactga acaggctttc cactttggcc agcgggtatt tttttatcgc ttccgccagc 780
gaaagatcct cgaggcccag ctcggtttta atcaacaggc tgaccatatt ccccgccccg 840
gcgagatgga ccgccgttat ctgcccctcc gcgttgacga tcgccgcatc cgtcgagccg 900
gcgccgaggt cgaggatcgc cagcggcgcc gcacagccgg gagtggttaa cgccccggcg 960
atggccatgt tggcctccac gccgcccacc accacctcgg tctgcagtcg ggcgotcagt 1020
tcgcgggcga taacctgcat ttgcagacga tccgctttca ccatcgccgc catcccgacg 1080
gcattctcca tggcgcactc gccggccatc ccgccctgca ccttgcgcgg aataaacgta 1140
tccaccgcca gcagatcctg gatgtatatc gcgctcatct catggccggt cagggacgcc 1200
attaccttgc gcacccgctc aagcatgccg ccggcgtggg tgcccggttc gccgcggatg 1260
tcgcgtaccg gagcgcaggc gctcatcgcc tgcatgatgg cttccgcgcc ctcggcgaca 1320
tcggcctctc cgcggcgctt ttcgccgcta atgtagaggt tgcccgccgg gatcacccgc 1380
gactgcacat ccccctgcgg ggtcttgagc accaccgcgg aacggttgcc aatcagggcg 1440
cgggcgatgg ggacgatggc ctgggtctct tccgggctta gcccgaagaa ggtggcgatc 1500
ccgtagggat tcgacaggat ccgcaccacc tggcccggcg cggccacttc caccgccgcc 1560
attaccccct cggggacctg ctccagcagc gtcacttcat ccaccaccgg cagggtttta 1620
cgcaggcggt tgttcaccag cacgccgtcg tcctttttga ggatcgccgc caccacgttg 1680
atcccccggt cgagcgcctc attgagccac cacacggcgt caaggaaatc gacggcgtcg 1740
tcaatcagta cgatccaccc ctcggcatac tgcgccgccg gcagcgtcgc cagccgcccg 1800
agggcgatag tcgtccccac gccaacgccc accccgcccg gcgtctgcgg gttatgaccg 1860
atcatggtcg attcggtgat aatggtctcg gtgatggtct ccatcgccac atcgccaatc 1920
accggcgcgg cttcgttaag atagatgcga gagacatcgc tcatcgacca cggtgttttc 1980
gccagggcct gctccagcgc ggcgagggtc ccggcgatat tgtcccgcgt ccctttcatg 2040
cccgtcgtcg cgacgatccc gctggcaaca aacgccctcg cctgcgggta gtcggacgcc 2100
agcgccacct cggtggtggc gttgccgata tcaatcccgg ctattaacgg catgctgacc 2160
tccgcttagc ttcctttacg cagcttatgc cgctgctgat acacttccgc cgactcccgg 2220
acaaaggcgg cattcactgt cgcatgccag gtgtgctcca gctcgtcggc gatcgccagc 2280
agctccgcct gcgaggagcg gaacgggcgc agcgcgttat agatagccag aatgcgctcg 2340
tcaggaatgg cgataagctc cgccgcgcgg cggaaattgc gcgccaccgc atggcgctgc 2400
atctgctcgg caatctgcgc ctggtactca agggtctggc gggagatccg cacatcctgc 2460
gggcccacct cgccagagag caccttctcg agggtaatat cggtcaatgg tttgccggta 2520
ggcgtcagga tatgctccgg gcagcgggtg gctaacggat aatcctgcac gcgcatggtt 2580
ttctcgctca tggccactcc cttactaagc cgatgcgcag ggtgacgggc tcggcgtcct 2640
gcaccacatg tttggtctct ttgatatgaa atagcgcggc tttggccata aatttcggcc 2700
gcaccatctg atcgttcacc accggcaccg gcgaaggtga ctctttgcgc gcatagcgcg 2760
cagcgttttt gccaatctgc cggtaggtct ccagcgtcag cagcggcgcc tgggagaaca 2820
gctccaggtt gctgagcggc agcagatcgc gctgatggat gaccgtggtc cccttcgact 2880
ggataccgat gccgatcccc gagccgctca ggctggccgc atcccaggcc ataaaggaga 2940
cgtcggacgt gcgcagaatg cgcaccaccc gggcgtgaag cccctcttct tccaccccgg 3000
caatcagctc tttgaggatc gcgccatggg gcatatcgat cagagtgtga tgctggtgtt 3060
tatcgaaggc agggccgacg ccgatcacca cttcatcggc gcgttcatcg gcagaagcta 3120
ccccgccctc gi:;gggttttc agggtaaaag agggctgaat ttgggttgtc tgttgcacag 3180
gaataccgcc ttattcaatg gtgtcgggct gaaccacgcc cggaatattt ttgatctccg 3240
cccagcgttc ggcagagatg cgatagccgg tgcccggccc ctgatagtca ttgatgtcgt 3300
tgaccgcact caccacctcg aactgccgat cgagsiatggc cgaggtctgc aggtaatcgc 3360
cggtgacccg ctggcgcagc atattgagaa tattgctggc gatatcctca aagccgctgc 3420
ggctcagcgc gccgacaata tcgaggccgg tgatgttgcg cttcatcatc tcttccaccg 3480
cactcagatc ctccaccacg ttacgcggcg gcatctcgtt gctgccgtgc gcgtaggtgg 3540
cggcctccac ctcctcgtcg gcgattggcg gcagccccag ctcgcggaaa accgcctgga 3600
tcgcccgcgc cgctttctgg cgaatggcaa tggtttccgc ctcggtcacc ggacgcaggc 3660
cgccgtcaac catcaggtca cgctgcagga tgttgtaatc atcaaaatct tccgcatcga 3720
agttcgagcc ggcgaacatg ttgtcgtagt tcggcaccgc gctgtagccg gagaaaataa 3780
agtcggtgcc cggcagcatc tgcatcaggg tgcgcgcggt gcggcgaata tccgagtggg 3840
agaaagtctg gtcgttggcg gacgccactt cgaggtcgag catagaggcg atcaggtttt 3900
ccgccagcac cgcccgaatg cccgacggca cagcgccggt catgccgata cagctcaccg 3960
cgccgttttg cagtccctga accccggcgc ctttagtaat gaagatgcag cgcgattcga 4020
ggtagagcat cgacttgctc tccgaatagc ccatcagcgc ttcggatccg gtgccggagg 4080
tgtagcgcat tttcaacccg cgggaggcgt aggccgaggc gaggaacgcc tttgaccacg 4140
gcgtatcatc gccgtcggta aataccgctt cggtgccgta gaccgacacc gtctcggcgt 4200
agctggttaa gccacgcatg cccagctcca gctcggtggc ctcttccacc gagcactgcg 4260
tcaacacgcc ggggcggccg cactgcgaac cgaccaacag cgccagggcg ttaaacggcg 4320
cgtagcgcgc gataccgacc gtggtctcct gttctgagaa gccgcggatc ccggcctcgg 4380
cggcgtcagc ggcaatctgc accggattat ctttgagatt ggtgacgtgg cactggttgg 4440
agggggcccg gcgggcacgc atcttctgca gcgccatcat catctccacc acgttcatct 4500
gcgccatcac ctcgaccgct ttggccggcg tgatggcggt agtgatggca atgatctcct 4560
cccggctgac gtgaatatcc accagcatac gggctatttc caccgcctcc aggcgcattg 4620
cctgctctgt gcgctcaacg ttgatcgcgt aatcggcgat aaatcggtcg atcatgtcaa 4680
actggccccg gcgtttgccg tccagttcga cgatcagacc gttgtccact tttactgaag 4740
agaccgggtc aaaggggctg tccatggcga tcagcccctc ttcaggccac tcgccaatca 4800
gcccgtcctg attgacgggg cgctgggcca gtactgcaaa tcgttttgat cttttcattg 4860
ttcatcggct caaaaggtga aacccgcaga cggtagcgaa tacgccgggc cagcgtcgtt 4920
gccgcccggc cattaccggc aatagcggaa ctttaaatga gccagtggtg aaaaaaataa 4980
atttaatttc gtttcaattt ggcacacgaa atctaccgac agtttcacta tgaaacttta 5040
c^ccggcggc aaaaataaaa aatgtgatcg cccgcaatga tataaatcaa ttaataaaaa 5100
acgcccttaa ttacgttttt ccgacgctat tttaacccta ttgactaaat catggcgggc 5160
gacazuiataa cgctgacaaa aacaaagcaa gccaaccgaa tggtaatagt tttttactat 5220
cgccccctac tgactattcg cgccagcgtt atcctggtgc gggagaga 5268



<210> 10 

<211> 607 

<212> PRT

<213> Klebsiella pneumoniae 

<400> 10 

Met Pro Leu Ile Ala Gly Ile Asp Ile Gly Asn Ala Thr Thr Glu Val
1 5 10 15

Ala Leu Ala Ser Asp Tyr Pro Gln Ala Arg Ala Phe Val Ala Ser Gly
20 25 30

Ile Val Ala Thr Thr Gly Met Lys Gly Thr Arg Asp Asn Ile Ala Gly
35 40 45

Thr Leu Ala Ala Leu Glu Gln Ala Leu Ala Lys Thr Pro Trp Ser Met
50 55 60

Ser Asp Val Ser Arg Ile Tyr Leu Asn Glu Ala Ala Pro Val Ile Gly
65 70 75 80

Asp Val Ala Met Glu Thr Ile Thr Glu Thr Ile Ile Thr Glu Ser Thr
85 90 95

Met Ile Gly His Asn Pro Gln Thr Pro Gly Gly Val Gly Val Gly Val
100 105 110

Gly Thr Thr Ile Ala Leu Gly Arg Leu Ala Thr Leu Pro Ala Ala Gln
115 120 125

Tyr Ala Glu Gly Trp Ile Val Leu Ile Asp Asp Ala Val Asp Phe Leu
130 135 140

Asp Ala Val Trp Trp Leu Asn Glu Ala Leu Asp Arg Gly Ile Asn Val
145 150 155 160

Val Ala Ala Ile Leu Lys Lys Asp Asp Gly Val Leu Val Asn Asn Arg
165 170 175

Leu Arg Lys Thr Leu Pro Val Val Asp Glu Val Thr Leu Leu Glu Gln
180 185 190

Val Pro Glu Gly Val Met Ala Ala Val Glu Val Ala Ala Pro Gly Gln
195 200 205

Val Val Arg Ile Leu Ser Asn Pro Tyr Gly Ile Ala Thr Phe Phe Gly
210 215 220

Leu Ser Pro Glu Glu Thr Gln Ala Ile Val Pro Ile Ala Arg Ala Leu
225 230 235 240

Ile Gly Asn Arg Ser Ala Val Val Leu Lys Thr Pro Gln Gly Asp Val
245 250 255

Gln Ser Arg Val Ile Pro Ala Gly Asn Leu Tyr Ile Ser Gly Glu Lys
260 265 270

Arg Arg Gly Glu Ala Asp Val Ala Glu Gly Ala Glu Ala Ile Met Gln
275 280 285

Ala Met Ser Ala Cys Ala Pro Val Arg Asp Ile Arg Gly Glu Pro Gly
290 295 300

Thr His Ala Gly Gly Met Leu Glu Arg Val Arg Lys Val Met Ala Ser
305 310 315 320

Leu Thr Gly His Glu Met Ser Ala Ile Tyr Ile Gln Asp Leu Leu Ala
325 330 335

Val Asp Thr Phe Ile Pro Arg Lys Val Gln Gly Gly Met Ala Gly Glu
340 345 350

Cys Ala Met Glu Asn Ala Val Gly Met Ala Ala Met Val Lys Ala Asp
355 360 365

Arg Leu Gln Met Gln Val Ile Ala Arg Glu Leu Ser Ala Arg Leu Gln
370 375 380

Thr Glu Val Val Val Gly Gly Val Glu Ala Asn Met Ala Ile Ala Gly
385 390 395 400

Ala Leu Thr Thr Pro Gly Cys Ala Ala Pro Leu Ala Ile Leu Asp Leu
405 410 415

Gly Ala Gly Ser Thr Asp Ala Ala Ile Val Asn Ala Glu Gly Gln Ile
420 425 430

Thr Ala Val His Leu Ala Gly Ala Gly Asn Met Val Ser Leu Leu Ile
435 440 445

Lys Thr Glu Leu Gly Leu Glu Asp Leu Ser Leu Ala Glu Ala Ile Lys
450 455 460

Lys Tyr Pro Leu Ala Lys Val Glu Ser Leu Phe Ser Ile Arg His Glu
465 470 475 480

Asn Gly Ala Val Glu Phe Phe Arg Glu Ala Leu Ser Pro Ala Val Phe
485 490 495

Ala Lys Val Val Tyr Ile Lys Glu Gly Glu Leu Val Pro Ile Asp Asn
500 505 510

Ala Ser Pro Leu Glu Lys Ile Arg Leu Val Arg Arg Gln Ala Lys Glu
515 520 525

Lys Val Phe Val Thr Asn Cys Leu Arg Ala Leu Arg Gln Val Ser Pro
530 535 540

Gly Gly Ser Ile Arg Asp Ile Ala Phe Val Val Leu Val Gly Gly Ser
545 550 555 560

Ser Leu Asp Phe Glu Ile Pro Gln Leu Ile Thr Glu Ala Leu Ser His
565 570 575

Tyr Gly Val Val Ala Gly Gln Gly Asn Ile Arg Gly Thr Glu Gly Pro
580 585 590

Arg Asn Ala Val Ala Thr Gly Leu Leu Leu Ala Gly Gln Ala Asn
595 600 605



<210> 11 
<211> 141 
<212> PRT

<213> Klebsiella pneumoniae

<400> 11 

Met Ser Glu Lys Thr Met Arg Val Gln Asp Tyr Pro Leu Ala Thr Arg
1 5 10 15

Cys Pro Glu His Ile Leu Thr Pro Thr Gly Lys Pro Leu Thr Asp Ile
20 25 30

Thr Leu Glu Lys Val Leu Ser Gly Glu Val Gly Pro Gln Asp Val Arg
35 40 45

Ile Ser Arg Gln Thr Leu Glu Tyr Gln Ala Gln Ile Ala Glu Gln Met
50 55 60

Gln Arg His Ala Val Ala Arg Asn Phe Arg Arg Ala Ala Glu Leu Ile
65 70 75 80

Ala Ile Pro Asp Glu Arg Ile Leu Ala Ile Tyr Asn Ala Leu Arg Pro
85 90 95

Phe Arg Ser Ser Gln Ala Glu Leu Leu Ala Ile Ala Asp Glu Leu Glu
100 105 110

His Thr Trp His Ala Thr Val Asn Ala Ala Phe Val Arg Glu Ser Ala
115 120 125

Glu Val Tyr Gln Gln Arg His Lys Leu Arg Lys Gly Ser
130 135 140

<210> 12
<211> 146
<212> PRT

<213> Klebsiella pneumoniae
<400> 12

Met Pro His Gly Ala Ile Leu Lys Glu Leu Ile Ala Gly Val Glu Glu
1 5 10 15

Glu Gly Leu His Ala Arg Val Val Arg Ile Leu Arg Thr Ser Asp Val
20 25 30

Ser Phe Met Ala Trp Asp Ala Ala Asn Leu Ser Gly Ser Gly Ile Gly
35 40 45

Ile Gly Ile Gln Ser Lys Gly Thr Thr Val Ile His Gln Arg Asp Leu
50 55 60

Leu Pro Leu Ser Asn Leu Glu Leu Phe Ser Gln Ala Pro Leu Leu Thr
65 70 75 80

Leu Glu Thr Tyr Arg Gln Ile Gly Lys Asn Ala Ala Arg Tyr Ala Arg
85 90 95

Lys Glu Ser Pro Ser Pro Val Pro Val Val Asn Asp Gln Met Val Arg
100 105 110

Pro Lys Phe Met Ala Lys Ala Ala Leu Phe His Ile Lys Glu Thr Lys
115 120 125

His Val Val Gln Asp Ala Glu Pro Val Thr Leu His Ile Asp Leu Val
130 135 140

Arg Glu
145

<210> 13
<211> 555
<212> PRT

<213> Klebsiella pneumoniae
<400> 13

Met Lys Arg Ser Lys Arg Phe Ala Val Leu Ala Gln Arg Pro Val Asn
1 5 10 15

Gln Asp Gly Leu Ile Gly Glu Trp Pro Glu Glu Gly Leu Ile Ala Met
20 25 30

Asp Ser Pro Phe Asp Pro Val Ser Ser Val Lys Val Asp Asn Gly Leu
35 40 45

Ile Val Glu Leu Asp Gly Lys Arg Arg Asp Gln Phe Asp Met Ile Asp
50 55 60

Arg Phe Ile Ala Asp Tyr Ala Ile Asn Val Glu Arg Thr Glu Gln Ala
65 70 75 80

Met Arg Leu Glu Ala Val Glu Ile Ala Arg Met Leu Val Asp Ile His
85 90 95

Val Ser Arg Glu Glu Ile Ile Ala Ile Thr Thr Ala Ile Thr Pro Ala
100 105 110

Lys Ala Val Glu Val Met Ala Gln Met Asn Val Val Glu Met Met Met
115 120 125

Ala Leu Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Asn Gln Cys His
130 135 140

Val Thr Asn Leu Lys Asp Asn Pro Val Gln Ile Ala Ala Asp Ala Ala
145 150 155 160

Glu Ala Gly Ile Arg Gly Phe Ser Glu Gln Glu Thr Thr Val Gly Ile
165 170 175

Ala Arg Tyr Ala Pro Phe Asn Ala Leu Ala Leu Leu Val Gly Ser Gln
180 185 190

Cys Gly Arg Pro Gly Val Leu Thr Gln Cys Ser Val Glu Glu Ala Thr
195 200 205

Glu Leu Glu Leu Gly Met Arg Gly Leu Thr Ser Tyr Ala Glu Thr Val
210 215 220

Ser Val Tyr Gly Thr Glu Ala Val Phe Thr Asp Gly Asp Asp Thr Pro
225 230 235 240

Trp Ser Lys Ala Phe Leu Ala Ser Ala Tyr Ala Ser Arg Gly Leu Lys
245 250 255

Met Arg Tyr Thr Ser Gly Thr Gly Ser Glu Ala Leu Met Gly Tyr Ser
260 265 270

Glu Ser Lys Ser Met Leu Tyr Leu Glu Ser Arg Cys Ile Phe Ile Thr
275 280 285

Lys Gly Ala Gly Val Gln Gly Leu Gln Asn Gly Ala Val Ser Cys Ile
290 295 300

Gly Met Thr Gly Ala Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu
305 310 315 320

Asn Leu Ile Ala Ser Met Leu Asp Leu Glu Val Ala Ser Ala Asn Asp
325 330 335

Gln Thr Phe Ser His Ser Asp Ile Arg Arg Thr Ala Arg Thr Leu Met
340 345 350

Gln Met Leu Pro Gly Thr Asp Phe Ile Phe Ser Gly Tyr Ser Ala Val
355 360 365

Pro Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Phe Asp Ala Glu Asp
370 375 380

Phe Asp Asp Tyr Asn Ile Leu Gln Arg Asp Leu Met Val Asp Gly Gly
385 390 395 400

Leu Arg Pro Val Thr Glu Ala Glu Thr Ile Ala Ile Arg Gln Lys Ala
405 410 415

Ala Arg Ala Ile Gln Ala Val Phe Arg Glu Leu Gly Leu Pro Pro Ile
420 425 430

Ala Asp Glu Glu Val Glu Ala Ala Thr Tyr Ala His Gly Ser Asn Glu
435 440 445

Met Pro Pro Arg Asn Val Val Glu Asp Leu Ser Ala Val Glu Glu Met
450 455 460

Met Lys Arg Asn Ile Thr Gly Leu Asp Ile Val Gly Ala Leu Ser Arg
465 470 475 480

Ser Gly Phe Glu Asp Ile Ala Ser Asn Ile Leu Asn Met Leu Arg Gln
485 490 495

Arg Val Thr Gly Asp Tyr Leu Gln Thr Ser Ala Ile Leu Asp Arg Gln
500 505 510

Phe Glu Val Val Ser Ala Val Asn Asp Ile Asn Asp Tyr Gln Gly Pro
515 520 525

Gly Thr Gly Tyr Arg Ile Ser Ala Glu Arg Trp Ala Glu Ile Lys Asn
530 535 540

Ile Pro Gly Val Val Gln Pro Asp Thr Ile Glu
545 550 555



<210> 14 

<211> 56 

<212> DNA

<213> Escherichia coli 



<400> 14 

gctaccatgg cttaaccggt accaaggaga tatcatatgt cagtacccgt tcaaca 56 



<210> 15 
<211> 59
<212> DNA

<213> Escherichia coli
<400> 15

gcctcgagtc tagagccgtc gacgggaatt cgagctctta agactgtaaa taaaccacc 59



<210> 16
<211> 46 
<212> DNA 

<213> Saccharomyces cerevisiae 
<400> 16 

gcggtaccaa ggaggtatca tatgttcagt agatctacgc tctgct



<210> 17
<211> 33
<212> DNA

<213> Saccharomyces cerevisiae
<400> 17

gcgaattcga gctcttactc gtccaatttg gcac 



<210> 18 
<211> 35 
<212> DNA

<213> Homo sapiens 

<400> 18 

gcggtaccaa ggaggtatca tatgtcagcc gccgc 



<210> 19 
<211> 39
<212> DNA
<213> Homo sapiens 

<400> 19 

gcgaattcga gctcttatga gttcttctga ggcactttg 39 



<210> 20
<211> 44 
<212> DNA 

<213> Escherichia coli 
<400> 20 

gcggtaccaa ggaggtatca tatgaccaat aatccccctt cagc



<210> 21 
<211> 38 
<212> DNA 

<213> Escherichia coli 






<400> 21 

gcgaattcga gctcttagaa cagccccaac ggtttatc



<210> 22
<211> 20
<212> DNA

<213> Escherichia coli 



<400> 22 

atcccgccgt taaccaccat



<210> 23 
<211> 34
<212> DNA

<213> Escherichia coli 
<400> 23 

gcggtaccat tgttatccgc tcacaattcc acac 34 


